{"id":141366,"date":"2026-02-10T19:33:33","date_gmt":"2026-02-10T17:33:33","guid":{"rendered":"https:\/\/www.javacodegeeks.com\/?p=141366"},"modified":"2026-02-10T19:33:35","modified_gmt":"2026-02-10T17:33:35","slug":"working-with-orc-files-in-python","status":"publish","type":"post","link":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html","title":{"rendered":"Working with ORC files in Python"},"content":{"rendered":"<p>Optimized Row Columnar (ORC) is a high-performance columnar file format widely used in big data ecosystems such as Apache Hive, Spark, and Flink. This guide walks through writing, reading, and processing ORC files in Python, with examples.<\/p>\n<h2><a name=\"section-1\"><\/a>1. Introduction to the ORC File Format<\/h2>\n<p><a href=\"https:\/\/orc.apache.org\/docs\/\" target=\"_blank\" rel=\"noopener\">ORC (Optimized Row Columnar)<\/a> is a highly efficient columnar storage format originally developed for Apache Hive and now widely used across the Hadoop ecosystem. It is designed to store and process very large datasets efficiently in distributed systems. Instead of storing data row by row, ORC groups rows into large units called stripes and, within each stripe, stores the values column by column. Each stripe contains index data, row data, and a stripe footer with metadata. 
This layout allows query engines to read only the columns and row ranges required for a given query, significantly reducing I\/O.<\/p>\n<h3>1.1 Key Advantages<\/h3>\n<ul>\n<li>High compression ratios \u2013 ORC supports multiple compression algorithms (such as ZLIB, Snappy, and LZO) and applies compression at the column level, often achieving much better compression than row-based formats.<\/li>\n<li>Faster query performance \u2013 Columnar storage minimizes disk reads and improves CPU efficiency by enabling vectorized execution in query engines like Hive, Presto, and Spark.<\/li>\n<li>Efficient predicate pushdown \u2013 ORC stores per-column statistics (min, max, null count, etc.) that allow query engines to skip entire stripes or row groups that do not satisfy filter conditions.<\/li>\n<li>Built-in indexing and statistics \u2013 Lightweight indexes and rich metadata are embedded directly in the file, eliminating the need for external index structures.<\/li>\n<\/ul>\n<p>Because of these characteristics, ORC is especially well-suited for read-heavy analytical workloads such as log analysis, business intelligence reporting, and data warehousing, where queries frequently scan large datasets but touch only a subset of columns.<\/p>\n<h3>1.2 When Should You Use ORC?<\/h3>\n<p>You should consider using ORC when:<\/p>\n<ul>\n<li>You are working with very large datasets that are primarily read-intensive rather than write-intensive.<\/li>\n<li>Your data is queried using analytical engines such as Apache Hive, Spark SQL, Presto, or Trino.<\/li>\n<li>Queries frequently access only a subset of columns, benefiting from ORC\u2019s columnar storage model.<\/li>\n<li>You want to take advantage of predicate pushdown and built-in column statistics to reduce I\/O.<\/li>\n<li>Efficient compression and reduced storage costs are important for your data pipeline.<\/li>\n<\/ul>\n<h2><a name=\"section-2\"><\/a>2. 
End-to-End Python ORC Example<\/h2>\n<h3>2.1 Required Dependencies<\/h3>\n<p>To work with ORC files in Python, you need to install the required third-party libraries. The pyarrow library provides native support for reading and writing ORC files, while pandas is used for data manipulation and analysis. Make sure you are using Python 3.8 or later before installing the dependencies.<\/p>\n<pre class=\"brush:plain; wrap-lines:false;\">pip install pyarrow pandas\n<\/pre>\n<h3>2.2 Writing, Reading, and Processing ORC Data in Python<\/h3>\n<pre class=\"brush:python; wrap-lines:false;\">import pyarrow as pa\nimport pyarrow.orc as orc\nimport pandas as pd\n\n\ndef _create_sample_log_data() -&gt; pd.DataFrame:\n    \"\"\"create sample log data including complex types\"\"\"\n    log_data = {\n        \"timestamp\": [\n            \"2026-01-01 10:00:00\",\n            \"2026-01-01 10:01:00\",\n            \"2026-01-01 10:02:00\",\n            \"2026-01-01 10:03:00\",\n        ],\n        \"service\": [\"auth\", \"auth\", \"payment\", \"payment\"],\n        \"level\": [\"INFO\", \"ERROR\", \"ERROR\", \"INFO\"],\n        \"message\": [\n            \"User login successful\",\n            \"Invalid credentials\",\n            \"Payment failed\",\n            \"Payment completed\",\n        ],\n        \"response_time_ms\": [120, 300, 850, 200],\n        \"tags\": [\n            [\"security\", \"login\"],\n            [\"security\", \"auth\"],\n            [\"payment\", \"failure\"],\n            [\"payment\", \"success\"],\n        ],\n        \"metadata\": [\n            {\"ip\": \"10.0.0.1\", \"browser\": \"chrome\"},\n            {\"ip\": \"10.0.0.2\", \"browser\": \"firefox\"},\n            {\"ip\": \"10.0.0.3\", \"browser\": \"safari\"},\n            {\"ip\": \"10.0.0.4\", \"browser\": \"edge\"},\n        ],\n    }\n    return pd.DataFrame(log_data)\n\n\ndef _get_orc_schema() -&gt; pa.Schema:\n    \"\"\"define orc schema with complex data types\"\"\"\n    return 
pa.schema([\n        (\"timestamp\", pa.string()),\n        (\"service\", pa.string()),\n        (\"level\", pa.string()),\n        (\"message\", pa.string()),\n        (\"response_time_ms\", pa.int32()),\n        (\"tags\", pa.list_(pa.string())),\n        (\"metadata\", pa.struct([\n            (\"ip\", pa.string()),\n            (\"browser\", pa.string()),\n        ])),\n    ])\n\n\ndef _write_orc_file(table: pa.Table, file_path: str) -&gt; None:\n    \"\"\"write pyarrow table to orc with compression\"\"\"\n    with open(file_path, \"wb\") as f:\n        orc.write_table(table, f, compression=\"zlib\")\n\n\ndef _read_orc_file(file_path: str) -&gt; pd.DataFrame:\n    \"\"\"read orc file and return pandas dataframe\"\"\"\n    with open(file_path, \"rb\") as f:\n        orc_file = orc.ORCFile(f)\n        table = orc_file.read()\n    return table.to_pandas()\n\n\ndef _process_logs(df: pd.DataFrame) -&gt; None:\n    \"\"\"perform basic log analysis\"\"\"\n    error_logs = df[df[\"level\"] == \"ERROR\"]\n\n    error_count_by_service = error_logs.groupby(\"service\").size()\n    print(\"\\nError count per service:\")\n    print(error_count_by_service)\n\n    avg_response_time = df.groupby(\"service\")[\"response_time_ms\"].mean()\n    print(\"\\nAverage response time (ms) per service:\")\n    print(avg_response_time)\n\n    print(\"\\nSample complex fields:\")\n    print(df[[\"tags\", \"metadata\"]])\n\n\ndef main() -&gt; None:\n    orc_file_path = \"application_logs.orc\"\n\n    df = _create_sample_log_data()\n    schema = _get_orc_schema()\n\n    table = pa.Table.from_pandas(df, schema=schema)\n\n    _write_orc_file(table, orc_file_path)\n    print(\"ORC file written successfully.\")\n\n    read_df = _read_orc_file(orc_file_path)\n    print(\"\\nData read from ORC file:\")\n    print(read_df)\n\n    _process_logs(read_df)\n\n\nif __name__ == \"__main__\":\n    main()\n<\/pre>\n<h4>2.2.1 Code Explanation<\/h4>\n<p>This Python example demonstrates a complete, 
end-to-end workflow for the ORC (Optimized Row Columnar) file format using PyArrow and Pandas, split into small helper functions whose leading underscore marks them as module-private by convention. The program begins by importing PyArrow for columnar data processing and native ORC support, along with Pandas for in-memory data manipulation.<\/p>\n<p>The <code>_create_sample_log_data<\/code> function constructs a small application log dataset and returns it as a Pandas DataFrame, including both primitive fields such as timestamps, service names, log levels, messages, and response times, as well as complex data types like a list of tags and a structured metadata object containing IP address and browser information.<\/p>\n<p>The <code>_get_orc_schema<\/code> function defines an explicit PyArrow schema, specifying column data types and nested structures (a list of strings for the tags, a struct for the metadata) so that the complex fields are written with the intended types rather than inferred ones. In the main execution flow, the DataFrame is converted into an immutable, columnar PyArrow Table using this schema, which is the in-memory representation the ORC writer consumes.<\/p>\n<p>The <code>_write_orc_file<\/code> function then writes this table to disk in ORC format using ZLIB compression, reducing storage size while preserving fast read performance through column-level compression. 
After writing, the <code>_read_orc_file<\/code> function reopens the ORC file in binary mode, reads it using the ORCFile reader, and converts the resulting PyArrow Table back into a Pandas DataFrame for analysis.<\/p>\n<p>Finally, the <code>_process_logs<\/code> function performs analytical operations on the data by filtering ERROR-level log entries, aggregating error counts per service, calculating average response times per service, and printing the nested complex fields, showing that data read from ORC can be analyzed with ordinary Pandas operations.<\/p>\n<p>The <code>main<\/code> function orchestrates this end-to-end flow, while the <code>if __name__ == \"__main__\"<\/code> guard ensures the script can be safely imported as a module without running the example.<\/p>\n<h3>2.3 Program Output<\/h3>\n<pre class=\"brush:plain; wrap-lines:false;\">ORC file written successfully.\n\nData read from ORC file:\n             timestamp   service   level                  message  response_time_ms                tags                               metadata\n0  2026-01-01 10:00:00      auth    INFO   User login successful               120    [security, login]    {'ip': '10.0.0.1', 'browser': 'chrome'}\n1  2026-01-01 10:01:00      auth   ERROR     Invalid credentials               300      [security, auth]   {'ip': '10.0.0.2', 'browser': 'firefox'}\n2  2026-01-01 10:02:00   payment   ERROR            Payment failed              850   [payment, failure]  {'ip': '10.0.0.3', 'browser': 'safari'}\n3  2026-01-01 10:03:00   payment    INFO        Payment completed               200   [payment, success]   {'ip': '10.0.0.4', 'browser': 'edge'}\n\nError count per service:\nservice\nauth       1\npayment    1\ndtype: int64\n\nAverage response time (ms) per service:\nservice\nauth       210.0\npayment    525.0\nName: response_time_ms, dtype: float64\n\nSample complex fields:\n                    tags 
                              metadata\n0     [security, login]    {'ip': '10.0.0.1', 'browser': 'chrome'}\n1       [security, auth]   {'ip': '10.0.0.2', 'browser': 'firefox'}\n2    [payment, failure]  {'ip': '10.0.0.3', 'browser': 'safari'}\n3    [payment, success]   {'ip': '10.0.0.4', 'browser': 'edge'}\n<\/pre>\n<h2><a name=\"section-3\"><\/a>3. Conclusion<\/h2>\n<p>ORC is a powerful columnar file format designed for high-performance analytics. With Python libraries like PyArrow, working with ORC files is simple and efficient. Because ORC is a language-neutral format, files written from Python can also be read by JVM-based engines such as Hive and Spark, which helps when building cross-language big data pipelines. If your workload involves large datasets, complex schemas, and analytical queries, ORC is an excellent choice.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Optimized Row Columnar (ORC) is a high-performance columnar file format widely used in big data ecosystems such as Apache Hive, Spark, and Flink. Let us delve into understanding how to work with the ORC file format in Python through a guide with examples. 1. 
Introduction to the ORC File Format ORC (Optimized Row Columnar) is &hellip;<\/p>\n","protected":false},"author":26931,"featured_media":219,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1878],"tags":[5097,224],"class_list":["post-141366","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python","tag-orc","tag-python"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.5 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Working with ORC files in Python - Java Code Geeks<\/title>\n<meta name=\"description\" content=\"How to work with the orc file format in python a guide with examples: Learn how to read, write, and process ORC file format in Python\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Working with ORC files in Python - Java Code Geeks\" \/>\n<meta property=\"og:description\" content=\"How to work with the orc file format in python a guide with examples: Learn how to read, write, and process ORC file format in Python\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html\" \/>\n<meta property=\"og:site_name\" content=\"Java Code Geeks\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/javacodegeeks\" \/>\n<meta property=\"article:published_time\" content=\"2026-02-10T17:33:33+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-02-10T17:33:35+00:00\" \/>\n<meta property=\"og:image\" 
content=\"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/python-logo.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"150\" \/>\n\t<meta property=\"og:image:height\" content=\"150\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Yatin Batra\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@javacodegeeks\" \/>\n<meta name=\"twitter:site\" content=\"@javacodegeeks\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Yatin Batra\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html\"},\"author\":{\"name\":\"Yatin Batra\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/person\\\/cda31a4c1965373fed40c8907dc09b8d\"},\"headline\":\"Working with ORC files in 
Python\",\"datePublished\":\"2026-02-10T17:33:33+00:00\",\"dateModified\":\"2026-02-10T17:33:35+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html\"},\"wordCount\":788,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/python-logo.jpg\",\"keywords\":[\"orc\",\"Python\"],\"articleSection\":[\"Python\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html\",\"name\":\"Working with ORC files in Python - Java Code Geeks\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/python-logo.jpg\",\"datePublished\":\"2026-02-10T17:33:33+00:00\",\"dateModified\":\"2026-02-10T17:33:35+00:00\",\"description\":\"How to work with the orc file format in python a guide with examples: Learn how to read, write, and process ORC file format in 
Python\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html#primaryimage\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/python-logo.jpg\",\"contentUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/python-logo.jpg\",\"width\":150,\"height\":150},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/working-with-orc-files-in-python.html#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.javacodegeeks.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Web Development\",\"item\":\"https:\\\/\\\/www.javacodegeeks.com\\\/category\\\/web-development\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Python\",\"item\":\"https:\\\/\\\/www.javacodegeeks.com\\\/category\\\/web-development\\\/python\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"Working with ORC files in Python\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#website\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/\",\"name\":\"Java Code Geeks\",\"description\":\"Java Developers Resource 
Center\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#organization\"},\"alternateName\":\"JCG\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.javacodegeeks.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#organization\",\"name\":\"Exelixis Media P.C.\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2022\\\/06\\\/exelixis-logo.png\",\"contentUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2022\\\/06\\\/exelixis-logo.png\",\"width\":864,\"height\":246,\"caption\":\"Exelixis Media P.C.\"},\"image\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/javacodegeeks\",\"https:\\\/\\\/x.com\\\/javacodegeeks\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/person\\\/cda31a4c1965373fed40c8907dc09b8d\",\"name\":\"Yatin Batra\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2022\\\/12\\\/Yatin.batra_.jpg\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2022\\\/12\\\/Yatin.batra_.jpg\",\"contentUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2022\\\/12\\\/Yatin.batra_.jpg\",\"caption\":\"Yatin Batra\"},\"description\":\"An experience full-stack engineer well versed with Core Java, Spring\\\/Springboot, MVC, Security, AOP, Frontend (Angular &amp; React), and cloud technologies 
(such as AWS, GCP, Jenkins, Docker, K8).\",\"sameAs\":[\"https:\\\/\\\/www.javacodegeeks.com\"],\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/author\\\/yatin-batra\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Working with ORC files in Python - Java Code Geeks","description":"How to work with the orc file format in python a guide with examples: Learn how to read, write, and process ORC file format in Python","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html","og_locale":"en_US","og_type":"article","og_title":"Working with ORC files in Python - Java Code Geeks","og_description":"How to work with the orc file format in python a guide with examples: Learn how to read, write, and process ORC file format in Python","og_url":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html","og_site_name":"Java Code Geeks","article_publisher":"https:\/\/www.facebook.com\/javacodegeeks","article_published_time":"2026-02-10T17:33:33+00:00","article_modified_time":"2026-02-10T17:33:35+00:00","og_image":[{"width":150,"height":150,"url":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/python-logo.jpg","type":"image\/jpeg"}],"author":"Yatin Batra","twitter_card":"summary_large_image","twitter_creator":"@javacodegeeks","twitter_site":"@javacodegeeks","twitter_misc":{"Written by":"Yatin Batra","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html#article","isPartOf":{"@id":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html"},"author":{"name":"Yatin Batra","@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/person\/cda31a4c1965373fed40c8907dc09b8d"},"headline":"Working with ORC files in Python","datePublished":"2026-02-10T17:33:33+00:00","dateModified":"2026-02-10T17:33:35+00:00","mainEntityOfPage":{"@id":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html"},"wordCount":788,"commentCount":0,"publisher":{"@id":"https:\/\/www.javacodegeeks.com\/#organization"},"image":{"@id":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html#primaryimage"},"thumbnailUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/python-logo.jpg","keywords":["orc","Python"],"articleSection":["Python"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html","url":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html","name":"Working with ORC files in Python - Java Code Geeks","isPartOf":{"@id":"https:\/\/www.javacodegeeks.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html#primaryimage"},"image":{"@id":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html#primaryimage"},"thumbnailUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/python-logo.jpg","datePublished":"2026-02-10T17:33:33+00:00","dateModified":"2026-02-10T17:33:35+00:00","description":"How to work with the orc file format in python a guide with examples: Learn how to read, write, and process ORC 
file format in Python","breadcrumb":{"@id":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html#primaryimage","url":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/python-logo.jpg","contentUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/python-logo.jpg","width":150,"height":150},{"@type":"BreadcrumbList","@id":"https:\/\/www.javacodegeeks.com\/working-with-orc-files-in-python.html#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.javacodegeeks.com\/"},{"@type":"ListItem","position":2,"name":"Web Development","item":"https:\/\/www.javacodegeeks.com\/category\/web-development"},{"@type":"ListItem","position":3,"name":"Python","item":"https:\/\/www.javacodegeeks.com\/category\/web-development\/python"},{"@type":"ListItem","position":4,"name":"Working with ORC files in Python"}]},{"@type":"WebSite","@id":"https:\/\/www.javacodegeeks.com\/#website","url":"https:\/\/www.javacodegeeks.com\/","name":"Java Code Geeks","description":"Java Developers Resource Center","publisher":{"@id":"https:\/\/www.javacodegeeks.com\/#organization"},"alternateName":"JCG","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.javacodegeeks.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.javacodegeeks.com\/#organization","name":"Exelixis Media 
P.C.","url":"https:\/\/www.javacodegeeks.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2022\/06\/exelixis-logo.png","contentUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2022\/06\/exelixis-logo.png","width":864,"height":246,"caption":"Exelixis Media P.C."},"image":{"@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/javacodegeeks","https:\/\/x.com\/javacodegeeks"]},{"@type":"Person","@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/person\/cda31a4c1965373fed40c8907dc09b8d","name":"Yatin Batra","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2022\/12\/Yatin.batra_.jpg","url":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2022\/12\/Yatin.batra_.jpg","contentUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2022\/12\/Yatin.batra_.jpg","caption":"Yatin Batra"},"description":"An experience full-stack engineer well versed with Core Java, Spring\/Springboot, MVC, Security, AOP, Frontend (Angular &amp; React), and cloud technologies (such as AWS, GCP, Jenkins, Docker, 
K8).","sameAs":["https:\/\/www.javacodegeeks.com"],"url":"https:\/\/www.javacodegeeks.com\/author\/yatin-batra"}]}},"_links":{"self":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/posts\/141366","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/users\/26931"}],"replies":[{"embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/comments?post=141366"}],"version-history":[{"count":0,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/posts\/141366\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/media\/219"}],"wp:attachment":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/media?parent=141366"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/categories?post=141366"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/tags?post=141366"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}