BUSINESS WHITE PAPER

The Pure Storage Platform for AI
Pure Storage® accelerates and simplifies AI deployments,
enhancing their value to the enterprise.

With readily available generative artificial intelligence (GenAI), AI has become a sine qua non of information technology (IT) operations. Enterprises in finance, medicine, manufacturing, transportation, security, and other sectors all realize that AI is now a survival issue for them. Those that use AI to identify trends, make accurate predictions, serve clients faster with less effort, and so forth have distinct competitive advantages over those that don’t.

“The playing field is poised to become a lot more competitive, and businesses that don’t deploy AI and data to help them innovate in everything they do will be at a disadvantage.”
Paul Daugherty, Chief Technology and Innovation Officer, Accenture
Quoted in: https://www.salesforce.com/blog/ai-quotes/

The increasing importance of AI-based solutions makes reliable, easy-to-use IT services a must for production deployments. This brief surveys AI needs throughout the project lifecycle (primarily from a storage perspective) and shows how the Pure Storage® portfolio of storage systems, data services, and workflow management products for Kubernetes promotes efficiency for AI and IT infrastructure teams as well as for the developers and MLOps engineers who design, implement, and run AI applications.

The AI Project Lifecycle

Organizations undertake AI projects to support mission objectives such as:

• More accurate medical diagnoses

• Acceleration of genomic research

• More predictable market fluctuations

• Bank card fraud detection

• Rapid identification of security threats

• etc.

Whatever their objectives, in-house AI projects tend to follow a trajectory similar to that shown in Figure 1, from conception through development and production to evolution.

FIGURE 1 AI Project Life Cycle

AI projects generally start with a proposed model (algorithm) and train (refine) it iteratively using steadily increasing amounts
of available or easily acquired input data for which outcomes are known until it reliably produces inferences (outcomes) that
support the mission objective. For example, a medical diagnosis model might be trained using thousands of MRI scans with known
diagnoses. The finished model would then take live scans as input and suggest diagnoses to medical practitioners.

Models are typically envisioned by small groups of data scientists, often in functional or business organizations rather than in
IT teams. Data scientists start with modest IT resources (e.g., public cloud virtual machines and storage) to experiment with
model variations.

Uncomplicate Data Storage, Forever 2


As model refinement progresses, development is typically taken over by Machine Learning Operations (MLOps) engineering organizations, which:

• Assume on-going model training responsibilities from data scientists.

• Manage the transition of project resources and collateral to IT environments with the scalability, performance, and reliability needed for late-stage training and production.

• Introduce automated workflows that enable self-service training for data scientists and other developers, and ultimately support robust production.

As projects progress, the intellectual property embodied in models and the curated training data become increasingly valuable. Along with performance and reliability, protection against data loss and theft of intellectual property become important considerations.

When models produce reliable inferences, they move into production. Production models use live data to produce inferences that
assist with business decisions. As with any mission-critical IT application, both AI models and the environments in which they run
must be stable. Reliable access to a model and the data it needs to function is key to stability.

Models in production are usually monitored for “data drift” to ensure that results continue to meet mission objectives. With GenAI,
open source or proprietary large language models (LLMs) are often combined with retrieval augmented generation (RAG) using
context-specific vector databases that are updated frequently to include new data and remove what has become less relevant.
In some cases, AI models may be completely reimagined as business needs change and/or if new types of training data
become available.
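
The RAG pattern mentioned above can be illustrated with a toy example. The sketch below is not how any particular vector database works; it is a minimal in-memory stand-in, assuming embeddings are already available as plain lists of floats. The names `TinyVectorStore` and `build_prompt` are hypothetical and chosen only for illustration.

```python
import math


class TinyVectorStore:
    """Toy in-memory vector index; a real project would use a vector database."""

    def __init__(self):
        self._items = {}  # doc_id -> (embedding, text)

    def upsert(self, doc_id, embedding, text):
        # Add new context or refresh an existing entry.
        self._items[doc_id] = (list(embedding), text)

    def remove(self, doc_id):
        # Drop context that has become less relevant.
        self._items.pop(doc_id, None)

    def search(self, query, k=1):
        # Rank stored texts by cosine similarity to the query embedding.
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(
            self._items.values(),
            key=lambda item: cosine(query, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


def build_prompt(question, context_passages):
    # Retrieved passages are placed ahead of the question sent to the LLM.
    context = "\n".join(context_passages)
    return f"Context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    store = TinyVectorStore()
    store.upsert("policy-2023", [1.0, 0.0], "Old refund policy")
    store.upsert("policy-2024", [0.9, 0.1], "Current refund policy")
    store.remove("policy-2023")  # outdated context is retired
    print(build_prompt("What is the refund policy?", store.search([1.0, 0.0])))
```

The frequent `upsert` and `remove` calls are the part that matters for storage: the index is a live data set that changes while production queries are reading it.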

Information Technology in AI Projects

While AI projects typically begin using modest in-house or public cloud IT resources, most are destined for eventual production.
Planning for production computing, storage, and software needs, and designing production workflows early in the development
process minimizes mid-stream infrastructure and procedure changes and accelerates return on investment.

Planning for production is particularly important with storage. Ideally, an AI storage infrastructure should provide:

• Non-disruptive expansion to meet rapidly growing data and changing I/O needs.

• Seamless data sharing among developers, training jobs, and production.

• Decade-long “24×365” duty cycles.

• Security to protect intellectual property from intrusion and theft.


How AI Projects Utilize IT

From an information technology perspective, AI projects can be thought of as consisting of two general types of tasks:

Data Curation
The best models are trained using input data from multiple disparate sources—event records, documents, images, sensor
readings, etc. Data curation is a blanket term for the acquisition, storage, and pre-processing of data for use in model training, and
later in production.

Projects usually preserve raw input data to avoid the time and cost of recreating or re-acquiring it. In most cases, it must be
curated (preprocessed) for model training. Data must be anonymized; timelines, measurement units, graphics resolutions, and so
forth must be reconciled; and items must be transformed into the file or object formats that training and production tools require.
Most projects preserve both raw and curated data.
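
As a rough illustration of the curation steps just described, the sketch below pseudonymizes an identifier, reconciles a measurement unit, and normalizes a timestamp for one record. The field names (`patient_id`, `temp`, `temp_unit`, `taken_at`) and the salt value are hypothetical, and a salted hash is only a simple stand-in for whatever anonymization method a real project's compliance rules require.

```python
import hashlib
from datetime import datetime, timezone


def anonymize(patient_id, salt):
    # Replace the identifier with a salted hash so records can still be
    # linked to each other without exposing the original value.
    digest = hashlib.sha256((salt + patient_id).encode("utf-8")).hexdigest()
    return digest[:16]


def curate(raw, salt="example-salt"):
    # Reconcile measurement units: store all temperatures in Celsius.
    temp_c = raw["temp"]
    if raw["temp_unit"] == "F":
        temp_c = (raw["temp"] - 32.0) * 5.0 / 9.0

    # Reconcile timelines: convert to UTC. This assumes the raw timestamp
    # is ISO 8601 text that carries an explicit UTC offset.
    taken_at = datetime.fromisoformat(raw["taken_at"]).astimezone(timezone.utc)

    # Emit only the fields that training tools need; the raw record is
    # preserved separately and is not modified.
    return {
        "subject": anonymize(raw["patient_id"], salt),
        "temp_c": round(temp_c, 2),
        "taken_at_utc": taken_at.isoformat(),
    }


if __name__ == "__main__":
    raw_record = {
        "patient_id": "P-1",
        "temp": 98.6,
        "temp_unit": "F",
        "taken_at": "2024-05-01T08:30:00-04:00",
    }
    print(curate(raw_record))
```

Because `curate` returns a new record instead of editing the input, both the raw and the curated forms remain available, which matches the practice of preserving both.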

What Data Curation Means for Storage Infrastructure

A large project might use petabytes of curated training data. Its storage infrastructure must be able
to scale from a few hundred gigabytes of files to petabytes contained in billions of files
and/or objects.

Raw data is used as input to curation but is then largely idle until a model is reimagined. Storage
used for it should be low-cost, but highly scalable.

Data is typically curated by processing large batches of raw data sequentially, but items are
accessed randomly during training and production. Flexible systems that perform well with
sequential, random, and mixed-access I/O workloads are key.
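
The difference between the two access patterns can be seen in the order in which items are requested. The sketch below is a simplified assumption about how a training loop might randomize its input; `curation_order` and `training_order` are hypothetical helper names, and real frameworks use their own samplers.

```python
import random


def curation_order(num_items):
    # Curation walks the raw batch front to back: a sequential pattern.
    return list(range(num_items))


def training_order(num_items, epoch, seed=0):
    # Each epoch visits every item exactly once, but in a different
    # shuffled order: a random pattern from the storage system's view.
    order = list(range(num_items))
    random.Random(seed + epoch).shuffle(order)
    return order


if __name__ == "__main__":
    print("curation:", curation_order(8))
    for epoch in range(2):
        print(f"epoch {epoch}:", training_order(8, epoch))
```

When a curation batch and several training epochs run at the same time, the storage system sees both patterns at once, which is the mixed workload the paragraph above refers to.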

Training and Production

The fundamental assumption of AI is that iterative training with large amounts of input data with known outcomes converges on
a model that reliably makes useful inferences when presented with input¹ for which outcomes are unknown. In a medical imaging
scenario for example, thousands of images with known diagnoses might be used to train a model that, when presented with new
images, would propose diagnoses used to advise physicians. Subject matter-specific applications often combine application-
specific data sets with readily available general-purpose large language models (LLMs) and use retrieval-augmented generation
(RAG) to satisfy natural language queries in their subject matter area.

As training progresses, the amount of curated data and the intensity with which it is used increase rapidly. In addition, the data,
files, and other collateral that comprise the evolving model become increasingly valuable.

When a completed model moves into production, its I/O needs change from the very intensive demands of a relatively small
number of training jobs to the agility required to service thousands of concurrent client transactions making many unrelated
I/O requests.


Finally, it is common to preserve production inputs and the corresponding output inferences for retraining and other techniques
that adapt models to changing conditions.

What Training and Production Mean for Storage Infrastructure

Storage for curated data must be scalable, both in capacity and performance, and must be easily
sharable by many concurrent training jobs. As the number of concurrent training jobs grows,
automated scheduling and data sharing are musts.

The business value of a production model requires storage with production-grade reliability,
performance, and administrative simplicity. The value of the intellectual capital embodied in it
and its training and production data requires robust security for “data at rest.”

In Summary: The Ideal Storage for an AI Infrastructure

Perhaps the most important attribute for AI project storage is agility—the ability to grow from a few hundred gigabytes to
petabytes, to perform well with rapidly changing mixed workloads, to serve data to training and production clients simultaneously
throughout a project’s life, and to support the data models used by project tools. The attributes of an ideal AI storage solution are:

Performance Agility

• I/O performance that scales with capacity.

• Rapid manipulation of billions of items, e.g., for randomization during training.

Capacity Flexibility

• Wide range (100s of gigabytes to petabytes) with easy, non-disruptive expansion.

• High performance with billions of data items.

• Range of cost points optimized for active and seldom-accessed data.

Availability & Data Durability

• Continuous operation over decade-long project lifetimes.

• Protection of data against loss due to hardware, software, and operational faults.

• Non-disruptive hardware and software upgrade and replacement.

• Seamless data sharing by development, training, and production.

Space and Power Efficiency

• Low space and power requirements that free data center resources for power-hungry computation.

Data Models

• Support for block, file, and object data models and common network protocols.

Security

• Strong administrative authentication.

• “Data at rest” encryption.

• Protection against malware (especially ransomware) attacks.

Operational Simplicity

• Non-disruptive modernization for continuous long-term productivity.

• Support for AI projects’ most-used interconnects and protocols.

• Autonomous configuration (e.g., device groups, data placement, protection, etc.).

• Self-tuning to adjust to rapidly changing mixed random/sequential I/O loads.

Appendix A lists potential pitfalls to be avoided when designing an AI storage infrastructure.


Pure’s AI Product Portfolio

Storage Systems
The Pure Storage® portfolio of all-flash storage systems, illustrated in Figure 2, includes three FlashArray™ scale-up Unified Block and File (UBF) servers, a FlashBlade® scale-out Unified Fast File and Object (UFFO) server, and two Unified Data Repository (UDR) servers, one based on FlashArray and the other on FlashBlade, for low-cost large-scale storage. All systems support broad ranges of capacity and are easily expandable online. Each is optimized for specific capacity ranges, data type(s), and cost/performance targets. With these servers, Pure Storage can satisfy virtually any AI storage requirement from project conception through model training, and on into production.

FIGURE 2 Pure’s AI Storage System Portfolio

Features Common to All Systems

All Pure Storage systems share key properties:

Reliability and Availability

Systems are designed to continue operating when any internal component fails. For example, they survive at least two (in most cases more) rare DirectFlash® Module² (DFM) failures that overlap in time without loss of data or client access to it.

Efficiency

Systems optimize capacity utilization by thin provisioning, that is, by deferring space allocation until clients write data. They allocate space autonomously to balance utilization and load. DFMs’ high density³ minimizes system “footprint,” power consumption, and ultimately, e-waste.

Simplicity

Systems minimize administrative tasks to the greatest extent possible: there are no “device groups” to manage, and no data placement or protection decisions to make. Systems report status and events to the Pure1® Cloud frequently; Pure1 analyzes behavior and proactively initiates any necessary service operations.

Continuity and Longevity

Systems are designed for lifetimes of a decade or more of continuous operation with no planned downtime or service outages, even during software and hardware upgrades and modernizations.

Evergreen®

Evergreen subscriptions that include regular software and hardware updates and periodic modernizations are perhaps Pure’s most important benefit for AI projects of long duration. The company offers subscriptions both for purchased systems and for storage delivered as a managed service (Evergreen//One™). Technical briefs TB-230601f and TB-230601o, available at https://support.purestorage.com or from a Pure Storage representative, describe the company’s Evergreen offerings in more detail.
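
Thin provisioning, described under Efficiency above, can be illustrated with a toy model. The sketch below is not Pure's implementation; it is a minimal illustration, with the hypothetical class name `ThinVolume`, of how a volume can advertise a large provisioned size while consuming space only for blocks that clients have actually written.

```python
class ThinVolume:
    """Toy thin-provisioned volume: blocks are allocated on first write."""

    def __init__(self, provisioned_blocks, block_size=4096):
        self.provisioned_blocks = provisioned_blocks
        self.block_size = block_size
        self._blocks = {}  # block number -> bytes, written blocks only

    def _check(self, block_no):
        if not 0 <= block_no < self.provisioned_blocks:
            raise IndexError("block outside provisioned range")

    def write(self, block_no, data):
        self._check(block_no)
        if len(data) > self.block_size:
            raise ValueError("data larger than one block")
        # Space is consumed only now, when a client actually writes.
        self._blocks[block_no] = data.ljust(self.block_size, b"\0")

    def read(self, block_no):
        self._check(block_no)
        # Never-written blocks read back as zeros and occupy no space.
        return self._blocks.get(block_no, b"\0" * self.block_size)

    def allocated_bytes(self):
        return len(self._blocks) * self.block_size

    def provisioned_bytes(self):
        return self.provisioned_blocks * self.block_size


if __name__ == "__main__":
    volume = ThinVolume(provisioned_blocks=1_000_000)
    volume.write(42, b"training sample")
    print("provisioned:", volume.provisioned_bytes(), "bytes")
    print("allocated:  ", volume.allocated_bytes(), "bytes")
```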


Product-Specific Features

Spectrum of Performance and Cost Options

From latency-optimized FlashArray//XL™ for rapid response in production to throughput-optimized FlashBlade//S for training with
very large data sets, to cost-optimized FlashArray//E™ and FlashBlade//E™ for less-active data (e.g., raw data, feature stores, etc.),
Pure’s product line offers cost/performance options that span AI project requirements.

Pure systems’ very high-performing metadata operations on large numbers of files and objects make them particularly suitable for
AI project training.

Capacity Flexibility

With maximum capacities ranging from FlashArray//C50’s 1.6PB (effective⁴) to FlashBlade//S’s nearly 20PB (physical), Pure’s
systems can support many in-house AI projects from concept through production with a single project-wide data hub. Systems
can “start small” in a single chassis with minimal physical capacity and be expanded up to a model’s maximum supported capacity
without interrupting service to applications.

Where multiple systems are required, for capacity, performance, cost, or data model reasons, Pure1 centralizes storage
management and supports AI-based “what if” capacity planning for users’ entire “fleets” of Pure systems.

Data Reduction

Pure’s systems optimize flash utilization by compressing data prior to storing it. In addition, FlashArray systems achieve further
efficiency by deduplicating blocks of data, replacing duplicate blocks with links to a single stored instance. Deduplication works
well with structured data (e.g., databases, tables, etc.).
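
The two data reduction techniques can be sketched in a few lines. The example below is a simplified illustration, not Pure's algorithm: it assumes fixed-size blocks, uses SHA-256 fingerprints to detect duplicates, and uses zlib as a stand-in compressor. The class name `ReducingStore` is hypothetical.

```python
import hashlib
import zlib


class ReducingStore:
    """Toy block store that compresses and deduplicates fixed-size blocks."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self._unique = {}   # fingerprint -> compressed block (stored once)
        self._volume = []   # logical block order, kept as fingerprints
        self.logical_bytes = 0

    def write(self, data):
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            if fingerprint not in self._unique:
                # First copy of this content: compress and store it.
                self._unique[fingerprint] = zlib.compress(block)
            # Duplicates only add a link to the single stored instance.
            self._volume.append(fingerprint)
            self.logical_bytes += len(block)

    def read(self):
        return b"".join(zlib.decompress(self._unique[fp]) for fp in self._volume)

    def unique_blocks(self):
        return len(self._unique)

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self._unique.values())

    def reduction_ratio(self):
        stored = self.stored_bytes()
        return self.logical_bytes / stored if stored else 0.0


if __name__ == "__main__":
    store = ReducingStore()
    table_page = (b"id,name,status\n" * 300)[:4096]
    store.write(table_page * 50)  # the same page written 50 times
    print("unique blocks:", store.unique_blocks())
    print(f"reduction ratio: {store.reduction_ratio():.0f}:1")
```

Repetitive, structured content reduces well in this model, while already-compressed or encrypted content would not.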

NVIDIA Collaboration

With approximately 80% market share,⁵ NVIDIA Corporation is the acknowledged leader in AI computation. Since 2017, Pure Storage has collaborated with NVIDIA to develop solutions for AI. The collaboration has resulted in the jointly developed AIRI® Pure Storage NVIDIA DGX BasePOD Reference Architecture⁶ for AI, based on NVIDIA’s DGX servers and network fabric coupled with Pure’s FlashBlade//S™ storage systems. As an NVIDIA BasePOD certified reference architecture, AIRI eliminates the design, deployment, and management complexity inherent in custom-crafted AI infrastructures.

FIGURE 3 Pure-NVIDIA Collaboration

More recently, the Pure Storage-NVIDIA collaboration has resulted in:

• Pure’s implementation of the NVIDIA GPUDirect⁷ storage protocol to transfer data directly between FlashBlade//S storage and
NVIDIA’s GPUs, bypassing control CPU memory.

• Storage partner validation for FlashBlade//S in NVIDIA-certified OVX L40S reference architectures offered by major server
vendors. When combined with FlashBlade//S storage, OVX-certified servers are complete AI platforms that accelerate small
model training and fine-tuning, as well as GenAI RAG and production inference workloads.

Finally, when used in conjunction with Portworx®, NVIDIA’s device plugin for Kubernetes⁸ provides comprehensive management
and scheduling of both GPU and storage resources at all AI project stages.


Portworx: Pure’s Secret Weapon

Portworx by Pure Storage is the company’s Kubernetes data services platform that provides persistent storage, data sharing and
protection, workflow automation, and (optional) disaster recovery for containerized applications.

Portworx accelerates development of IT environments for the containerized applications used in most AI projects with a software-
defined storage model that enables infrastructure-neutral access to data. Portworx supports any type of block storage, whether
located on-premises or in a public or private cloud.

Portworx presents standardized virtual block or file storage devices to applications, regardless of the on-premises or cloud technology used to instantiate it. It does this by making architect-defined storage classes available to developers. Storage classes standardize storage properties, simplifying self-service job creation and promoting stable, reliable development and production environments. In addition, Portworx includes templates that assist with setup for applications like Apache Kafka, Zookeeper, Elasticsearch, as well as for popular databases including SQL Server, MongoDB, Postgres, and Cassandra, all of which are commonly used in AI projects. It provides consistent development environments while enabling self-service job creation by data scientists and other developers.
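
To make the storage class idea concrete, the sketch below builds a Kubernetes StorageClass manifest and a matching PersistentVolumeClaim as Python dictionaries and prints them as JSON. The Kubernetes field names are standard; the provisioner name `pxd.portworx.com` and the `repl` and `sharedv4` parameters are assumptions about a typical Portworx setup and should be checked against current Portworx documentation. The helper names, class name, and sizes are hypothetical.

```python
import json


def portworx_storage_class(name, replicas=2, shared=False):
    # Architect-defined properties live in one place: the storage class.
    params = {"repl": str(replicas)}  # assumed Portworx replication parameter
    if shared:
        params["sharedv4"] = "true"   # assumed Portworx shared-volume parameter
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "pxd.portworx.com",  # assumed CSI provisioner name
        "parameters": params,
        "allowVolumeExpansion": True,
    }


def claim_for(storage_class, claim_name, size, shared=False):
    # A developer's self-service request names the class and a size only.
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name},
        "spec": {
            "storageClassName": storage_class["metadata"]["name"],
            "accessModes": ["ReadWriteMany" if shared else "ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    sc = portworx_storage_class("training-shared", replicas=3, shared=True)
    pvc = claim_for(sc, "train-data", "100Gi", shared=True)
    print(json.dumps(sc, indent=2))
    print(json.dumps(pvc, indent=2))
```

The design point is the split of responsibilities: the architect decides replication and sharing once in the class, and each developer's claim stays short and identical in shape across on-premises and cloud clusters.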

The software-defined storage model enables Portworx to share data among multiple Kubernetes pods running separate jobs. This makes it particularly useful for model training, where running many concurrent jobs that share the same input data is key to rapid implementation. As an example, Figure 4 shows how Portworx can simplify deployment of a training application that utilizes data from a database.

FIGURE 4 Using Portworx with a Database App

Finally, Portworx provides fault tolerance by replicating its virtual storage devices to either on-premises resources or to a public
cloud. It can also protect entire project environments by copying them to S3 objects in a public or on-premises private cloud.


Table 1 suggests ways in which Portworx can be used to advantage in in-house AI projects.

Portworx Feature | Application to AI

Up to 640PB per cluster | Adequate storage capacity for any size AI project.

Software-defined (“cloud-native”) storage | Presents on-premises, private cloud, and public cloud storage to applications as feature-consistent block volumes or NFS file systems.

Public and hybrid cloud storage | Supports a mix of AWS or Microsoft Azure and on-premises storage for a Kubernetes cluster.

Dynamic provisioning | Enables convenient self-provisioning of virtual storage resources for greater data scientist and developer agility.

Easy data migration | Minimizes public cloud access charges by copying data to on-premises storage for use by applications in multiple clusters.

Vector database support | Facilitates Retrieval Augmented Generation (RAG) for training customized GenAI models managed by Kubernetes.

Python libraries for Portworx services | Simplifies common operations in data scientists’ Jupyter notebooks for self-service resource instantiation and management.

Disaggregated architecture | Enables independent scaling of computing and storage, eliminating the need to provision unnecessary resources.

Storage class, resource, and application templates | Standardizes Kubernetes resources for consistency while eliminating the need for developer awareness of implementation details.

Data replication and backup protection | Transportable snapshots and backup of Kubernetes environments protect data transparently to developers.

Application checkpoints | Enables easy-to-use preservation of progress in long-running jobs.

TABLE 1 Portworx Features That Optimize AI Projects

While Portworx is designed to utilize any storage platform, combining it with Pure’s systems creates
a reliable, scalable, high-performing storage, data protection, and resource provisioning and
workflow management environment for containerized AI projects.


At the End of the Day...

The Pure Storage system portfolio includes storage for all phases of AI projects, large and small. Pure’s systems relieve IT, data
scientists, and MLOps teams of most common storage management tasks. With them, data scientists can concentrate on
modeling and MLOps teams can provide reliable, scalable, high-performing environments for project data with sharable storage
that expands to meet both training and production needs without disruption.

Available in performance-optimized and capacity-optimized models that scale on demand, the Pure Storage platform accelerates
and enhances AI projects for healthcare, genomics, exploration, finance, and many other fields. It can do so while sharing storage
capacity and I/O bandwidth with other data-intensive applications such as analytics, database backup and restore, software
development, media and entertainment post-production, electronic design automation (EDA), and more.

Portworx takes the guesswork out of creating robust, scalable Kubernetes environments for containerized AI training and
production applications. Its templates simplify configuring and implementing applications and databases commonly used in AI
projects; its built-in storage services enable data sharing and backup of both project data and entire project environments.

For Further Information…

Additional information on topics discussed in this brief is available at:

Pure Storage in Artificial Intelligence

https://www.purestorage.com/solutions/analytics-and-ai/artificial-intelligence.html

Overview of Pure Storage Portworx

https://portworx.com/


Appendix A:
AI Storage Pitfalls
Data storage can have a profound impact on in-house AI project cost, development speed, and ultimately, project success.
This appendix lists pitfalls to be avoided when planning and developing project storage infrastructures.

Configuration Rigidity

It is virtually impossible to predict lifetime storage needs accurately at the start of an AI project. Storage systems that are awkward or impossible to reconfigure, expand, or upgrade should therefore be avoided. Even upgradable systems can be problematic if upgrading entails service outages, especially in the production phase when inferences are driving business decisions.

Fragility

If an AI project is important enough to invest in, it’s important enough to keep available. Data scientist productivity is important, but as projects move to intensive training and production, availability is vital. Storage that can’t survive component failures, be repaired or upgraded online, or recover from user and administrative errors isn’t up to the AI job.

Data Isolation

Data that can’t be easily shared among developers and between containerized training apps and production tasks must be copied from where it is to where it’s needed. As data sets expand, copying large data sets becomes disruptive. In addition to being time and resource-consuming, data set copying initiated by humans can be error-prone.

Inflexibility

I/O requirements vary throughout the life of an AI project. Storage that is highly optimized for narrow use cases (e.g., by data set location, block sizes, file types, etc.) can make it difficult to respond as a project evolves. Manual storage “tuning” interrupts development and production and requires expensive expertise. And even experts don’t always get it right.

Data Model Rigidity

At the outset, AI projects typically utilize files and/or structured databases, hosted either on block storage or in file servers. As models develop and training data scales, many tools use data in object form to simplify organization and processing and construct vector databases that maximize the quality of search results. Storage systems that support only a single data model (e.g., file servers, block storage systems) may make converting input data into forms usable by training jobs and in production awkward and time-consuming.


Appendix B:
Using Pure’s Products for AI
The Pure Storage portfolio of all-flash systems meets AI capacity, I/O performance, and orchestration needs from project
conception through model development, on into production. Table 2 suggests the optimal applications for each of Pure’s systems
in AI projects.

Product Training Production Infrastructure

• Vector databases
very large objects I/O-intensive production
UFFO

0.35PB—19.6PB • Retrieval Augmented

(images, video,…) (e.g., GenAI)
(physical) Generation (RAG)

• Ecosystem streaming
(Kafka, etc.)
large data sets
0.96PB—5.5PB (effective) • Kubernetes volumes
& VMs

Latency-sensitive
production • Vector databases

• Retrieval Augmented
UBF

medium-large
Generation (RAG)
0.31PB—3.3PB (effective) data sets
• Kubernetes volumes
& VMs

• Feature stores

• Inference
1.6PB—8.9PB (effective)
input & output

1.0PB—4.0PB (effective) Raw training data Feature stores

Exception logging
UDR

seldom-accessed after Inference

Project archives
initial use input & output

4.0PB—8.0PB (physical)

TABLE 2 Pure Storage Systems in AI Projects


Storage for Every AI Project

As Table 2 suggests, Pure’s systems offer a breadth of capacity, cost/performance, and protocol support that fits all aspects of an AI project. In terms of the desirable AI storage properties listed under “In Summary: The Ideal Storage for an AI Infrastructure”:

Performance Agility

The portfolio includes a range of cost/performance options to meet lifetime project needs:

• All systems utilize reliable, high-performing, power-efficient flash-based DFMs exclusively.

• Latency-optimized UBF systems support quality-of-service (QoS) limits for prioritizing performance among data sets.

• Throughput-optimized UFFO systems scale out: adding blades to a system increases storage capacity, network performance, and processing capability.

• Cost-optimized UDR systems make retaining large amounts of infrequently-accessed data affordable.

• Rapid File Toolkit for UFFO systems accelerates bulk operations on millions of files.

Capacity Flexibility

All Pure systems are easily expandable from minimum to maximum capacity by non-disruptive addition of DFMs (and blades in scale-out UFFO systems). AI project architects can specify systems based on production expectations with initial configurations sized to meet early-stage capacity needs and add capacity incrementally as projects progress.

Availability & Durability

Pure’s systems provide reliable “24x7” storage for AI model training and production:

• Systems provide continuous availability over lifetimes that exceed a decade.

• Systems are highly available: internal redundancy allows them to continue operating if components fail. Purity software is designed to sustain a minimum of two concurrent DFM failures (in many cases, more) without loss of data availability.

• All updates and modernizations are non-disruptive, performed while systems are serving client I/O requests and with minimal performance impact.

• DFM service lifetimes are over 10 years, far longer than typical SSD specifications. Data durability (i.e., statistical mean time to data loss, or MTTDL) due to DFM failure is on the order of millions of years.

Space and Power Efficiency

Pure’s high-density all-flash system architectures:

• Hold up to 2 petabytes of physical storage in five rack units.

• Consume up to 90% less power per terabyte than competitive storage systems.

• Result in much less “e-waste” due to longer lifetimes of devices and storage systems.
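
The MTTDL figure cited under Availability & Durability can be put in context with a textbook approximation. The sketch below is not Pure's durability model; it uses the classic estimate for a group of N devices that survives k overlapping failures, MTTDL ≈ MTTF^(k+1) / (N(N-1)...(N-k) × MTTR^k), which assumes independent failures and constant failure and repair rates. The device count, MTTF, and rebuild time in the usage example are hypothetical inputs, not vendor specifications.

```python
HOURS_PER_YEAR = 24 * 365


def mttdl_hours(num_devices, mttf_hours, mttr_hours, tolerated_failures=2):
    """Approximate mean time to data loss for a redundancy group.

    Data is lost only if (tolerated_failures + 1) devices fail within
    overlapping repair windows.
    """
    numerator = mttf_hours ** (tolerated_failures + 1)
    denominator = mttr_hours ** tolerated_failures
    for i in range(tolerated_failures + 1):
        denominator *= num_devices - i
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the order of magnitude.
    hours = mttdl_hours(num_devices=20, mttf_hours=2_000_000, mttr_hours=24)
    print(f"approximate MTTDL: {hours / HOURS_PER_YEAR:,.0f} years")
```

The formula shows why tolerating a second overlapping failure matters: each additional tolerated failure multiplies the result by roughly MTTF divided by the rebuild time, which is a very large factor.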


Data Models

AI development and production tools utilize block storage, file servers, or object stores. Pure’s systems support block, file, and object protocols for easy integration:

• UBF systems support client-side file systems and object stores with virtual block devices as well as providing file services for hundreds of file systems containing as many as a half-billion files.

• UFFO systems support petabyte-scale data sets containing billions of files or S3 objects in thousands of file systems or object buckets.

• UDR systems store less-frequently accessed data, whether blocks, files, or objects, economically.

Security

Many AI projects utilize proprietary data. As models mature, both curated training data and the intellectual property in models’ digital representations become increasingly valuable. Pure’s systems provide a secure environment for digital information:

• They use strong administrator authentication to control access.

• They automatically encrypt all stored data and metadata.

• They can be configured to take periodic snapshots of selected data to protect against application and administrative errors.

• Users can enable SafeMode™ to protect against malware (e.g., ransomware).

Operational Simplicity

As AI projects evolve from concept through training and into production, a smooth operating environment becomes increasingly important. Pure storage systems’ ease of operation can simplify a project’s operating environment in several ways:

• They are typically installed and ready for use in a few hours rather than days.

• With support for all common I/O protocols (NVMe, NFS, SMB, S3, iSCSI, GPUDirect), they integrate easily into data centers. UBF systems can provide block and file services simultaneously; UFFO systems can simultaneously serve files and objects.

• They eliminate many common management operations: specifying protection, placing data sets, reserving spare capacity, tuning for performance, etc. are all autonomous.

• Both software/hardware upgrades and system modernizations are non-disruptive (i.e., are performed while systems are operating) and have negligible impact on performance. Data migration is never required.

• They are self-monitoring; they upload status and event logs⁹ every few seconds to the Pure1 monitoring and consolidated management service (available to all customers with active Evergreen subscriptions). Pure1’s AI-based behavioral models identify and report potential issues and provide experience-based capacity planning advice.

Accelerate AI Adoption With The Pure Storage Platform

1 For GenAI, “input data” usually takes the form of natural language queries. Cf. https://www.techtarget.com/searchdatamanagement/feature/Vector-search-now-a-critical-component-of-GenAI-development (“LLMs…are trained with extensive vocabularies and can determine the meaning of a question even if it isn't phrased in the business-specific language…”)
2 DFMs are Pure Storage-designed flash memory modules that provide the persistent storage in all Pure systems. They fulfill the role played by SSDs in conventional flash-based storage systems.
3 Up to 75 terabytes per module at publication time.
4 Effective capacity is that available for user data, net of erasure code overhead and system metadata. Users typically experience up to 2:1 data reduction with FlashBlade compression and as much as 5:1 with FlashArray deduplication and compression. Data reduction depends strongly on the nature of the data. For example, structured text and tabular data usually reduce by at least 2:1, whereas images, streams, and encrypted data are essentially incompressible.
5 https://www.britannica.com/topic/NVIDIA-Corporation
6 https://www.purestorage.com/docs.html?item=/type/pdf/subtype/doc/path/content/dam/pdf/en/reference-architectures/ra-airi-nvidia-dgx-basepod-architecture-config-guide.pdf
7 GPUDirect is a registered trademark of NVIDIA Corporation.
8 https://catalog.ngc.nvidia.com/orgs/nvidia/containers/k8s-device-plugin
9 Except where customer policies or regulations prohibit external connections.
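
The arithmetic behind note 4 can be sketched as follows. This is only an illustration: the 100 TB effective capacity is a hypothetical figure, the function name is invented for this example, and the ratios are the "up to" values quoted in the note, not guarantees for any particular data set.

```python
def logical_capacity_tb(effective_tb: float, reduction_ratio: float) -> float:
    """Return how much logical (pre-reduction) data fits in `effective_tb`.

    `reduction_ratio` is the N in an N:1 data reduction ratio; 1.0 means
    the data does not reduce at all (e.g., images or encrypted data).
    """
    if effective_tb < 0:
        raise ValueError("effective capacity cannot be negative")
    if reduction_ratio < 1.0:
        # Assumption for this sketch: reduction never expands the data.
        raise ValueError("reduction ratio must be at least 1.0")
    return effective_tb * reduction_ratio


# Hypothetical system with 100 TB of effective capacity.
EFFECTIVE_TB = 100.0

# Ratios taken from note 4; real results depend on the nature of the data.
SCENARIOS = {
    "incompressible data (images, streams, encrypted)": 1.0,
    "compression at 2:1": 2.0,
    "deduplication and compression at 5:1": 5.0,
}

for label, ratio in SCENARIOS.items():
    stored = logical_capacity_tb(EFFECTIVE_TB, ratio)
    print(f"{label}: about {stored:.0f} TB of logical data")
```

Because the result scales linearly with the ratio, a data set that is mostly images or encrypted content should be planned at close to 1:1 rather than at the quoted upper bounds.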

purestorage.com 800.379.PURE

©2024 Pure Storage, the Pure Storage P Logo, AIRI, DirectFlash, Evergreen, Evergreen//One, FlashArray, FlashArray//E, FlashArray//XL, FlashBlade, FlashBlade//E, FlashBlade//S, Portworx, Pure1, SafeMode, and the marks in the Pure Storage Trademark List are trademarks or registered trademarks of Pure Storage Inc. in the U.S. and/or other countries. The Trademark List can be found at purestorage.com/trademarks. Other names may be trademarks of their respective owners.

PS2571-02-en 05/24

Uncomplicate Data Storage, Forever
