Stack Color

This document presents the first polynomial-time algorithm with a performance guarantee for the Compile-Time Memory Allocation (CMA) problem in structured programming languages. It also gives a simple O(n log n)-time 3-approximation algorithm for off-line Dynamic Storage Allocation (DSA), improving the best previous approximation ratio of 5.

Uploaded by

jiangax09

Algorithms for Compile-Time Memory Optimization

Jordan Gergov*
Max-Planck Institute for Computer Science

*Max-Planck Institut für Informatik, Im Stadtwald, 66123 Saarbrücken, Germany; E-Mail: [email protected]

Abstract

Given a program P in a structured programming language, we propose the first polynomial-time algorithm for compile-time memory allocation for the source objects of P (e.g., arrays, C structures) with a performance guarantee. Further, we present a new and simple $O(n \log n)$ time 3-approximation algorithm for (off-line) Dynamic Storage Allocation and, thus, improve the best previous approximation ratio of 5.

1 Problem Motivation and Definition

Consider the following simple fragment of a C function:

    foo(int i) { char A[40]; int B[10]; if (i) {...} else {...} }

The dots in the i != 0 (resp. i == 0) branch of the if statement denote a C block without read or write access to the array B (resp. array A), i.e., either only A or only B is needed at run time. Assume that an int (char) is represented by 4 (1) bytes. Now, let us slightly simplify the compile-time memory allocation task and ask: how much storage does the compiler need to allocate for the arrays A and B? First, the compiler can use two non-overlapping storage areas for A and B, i.e., 40 + 10·4 = 80 bytes. Alternatively, the compiler could find out that A and B will never "exist" at the same time when the program is running. Hence, a storage area of 40 bytes is sufficient for both arrays, and the first solution can be improved by a factor of 2.

In general, given a source program, the compiler can use control-flow analysis and similar techniques to determine pairs of (source-code) objects such that the compiler is allowed to map the objects in each pair to overlapping memory regions. Now, assume that we have derived the following information from a program in a structured programming language: (1) the set of all (source-code) objects and their memory requirements; (2) a set of pairs of objects such that the objects in a pair cannot "interfere" with each other at run time and, hence, can share memory; (3) the associated control-flow graph, and, for any object, the vertices of the control-flow graph (i.e., source-code statements) at which the object is accessed as well as the associated access type information. Then, the Compile-time Memory Allocation problem (CMA) is to compute a mapping from objects to memory regions such that: (1') the size of the memory region that is allocated to an object equals the memory requirement of that object; (2') if two memory regions overlap, then the associated object pair is in the set of object pairs that are allowed by (2) to share memory; and (3') the memory usage is minimized. The trend towards more complex embedded systems increases the interest in memory optimization tools, and CMA algorithms are also potentially useful in optimization of software with a high memory demand. The performance of all previous heuristics for this NP-hard task can deviate significantly from optimality, cf. [1]. In Section 3, we discuss a novel approach to fast CMA algorithms with a performance guarantee. Fabri [1] gives a more detailed treatment of the CMA definition as well as an account of previous research.

One special case of CMA with an independent research history as well as with numerous applications is the off-line Dynamic Storage Allocation (DSA). Assume that we are given a straight-line program, i.e., its source code does not contain branch or loop statements and, for ease of presentation, no function calls. It turns out that in this case CMA has a nice geometric interpretation. A DSA input $I$ consists of $n$ triples of numbers, i.e., $I = \{(s_1, r_1, c_1), \ldots, (s_n, r_n, c_n)\}$. Each triple $(s_i, r_i, c_i)$ can be interpreted as an axis-parallel rectangle with a projection $(r_i, c_i)$ on the x-axis and a projection of length $s_i$ on the y-axis. We are only allowed to slide the rectangles along the y-axis while the x-projections of all rectangles stay fixed as in the input $I$. The objective is to pack all rectangles in a horizontal strip of minimum height. Intuitively speaking, $(s_i, r_i, c_i)$ is associated with a source-code object that is used only between the $r_i$-th and the $c_i$-th statement and requires $s_i$ contiguous memory cells. The previous research on fast DSA algorithms with a performance guarantee was based on on-line coloring, and results were obtained independently by Woodall, Kierstead, Chrobak, and Slusarek, cf. [2] for references. The best approximation ratio of 5 for this NP-hard problem is based on a different approach and is due to Gergov [2]. In the remaining sections, we give a rather condensed presentation of our results.
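
The geometric reading of a DSA input in Section 1 can be made concrete with a small sketch. The Python below is only an illustration and is not taken from the paper: the function names `overlaps`, `is_feasible` and `strip_height` and the three sample triples are assumptions chosen for this example. Each triple is (s, r, c), an offset is the y-coordinate of the lower side of the rectangle, and lifetimes are treated as open intervals (r, c).

```python
def overlaps(a_lo, a_hi, b_lo, b_hi):
    """Return True if the open intervals (a_lo, a_hi) and (b_lo, b_hi) intersect."""
    return max(a_lo, b_lo) < min(a_hi, b_hi)


def is_feasible(triples, offsets):
    """Check that rectangles with overlapping x-projections never overlap in y."""
    for i, (s_i, r_i, c_i) in enumerate(triples):
        for j in range(i + 1, len(triples)):
            s_j, r_j, c_j = triples[j]
            same_time = overlaps(r_i, c_i, r_j, c_j)
            same_cells = overlaps(offsets[i], offsets[i] + s_i,
                                  offsets[j], offsets[j] + s_j)
            if same_time and same_cells:
                return False
    return True


def strip_height(triples, offsets):
    """Height of the horizontal strip used by the packing (the memory usage)."""
    return max(y + s for (s, _, _), y in zip(triples, offsets))


# Hypothetical instance: two 40-cell objects with disjoint lifetimes and
# one 16-cell object whose lifetime overlaps both of them.
triples = [(40, 0, 5), (40, 5, 10), (16, 3, 8)]
stacked = [0, 0, 40]   # the two 40-cell objects share cells 0..39

print(is_feasible(triples, stacked), strip_height(triples, stacked))  # True 56
print(is_feasible(triples, [0, 0, 0]))                                # False
```

Placing all three objects in disjoint regions would need 96 cells, while the packing above needs 56, which is also the largest amount of memory that is live at any single point of this instance.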

2 Dynamic Storage Allocation

Given a DSA instance $I = \{(s_1, r_1, c_1), \ldots, (s_n, r_n, c_n)\}$, a solution to $I$ is a mapping $\alpha$ from $I$ to the nonnegative reals, $\alpha: I \to \mathbb{R}_{\ge 0}$, such that if $(r_i, c_i) \cap (r_j, c_j) \neq \emptyset$, then either $i = j$ or $\alpha((s_i, r_i, c_i)) + s_i \le \alpha((s_j, r_j, c_j))$ or $\alpha((s_j, r_j, c_j)) + s_j \le \alpha((s_i, r_i, c_i))$. The cost of $\alpha$ is defined as $\max_k (\alpha((s_k, r_k, c_k)) + s_k)$. Geometrically speaking, $\alpha$ maps each rectangle to the y-coordinate of its lower side. The algorithm described below is based on the concept of 2-allocation, cf. [2]. It first constructs an infeasible mapping $\alpha'$ that is then transformed into a feasible solution $\alpha$ to $I$.

Incremental 2-Allocation Construction (IAC):
Initially, we set: $J$ to the input $I$; $H$ (an auxiliary set of triples) to $\{(0, \min_i r_i, \max_j c_j)\}$; and $\alpha'$ (a mapping from a set of triples to $\mathbb{R}_{\ge 0}$) to the empty-domain mapping.

    while ($J \neq \emptyset$) {
        pick $(w, x_l, x_r) \in H$ such that $w$ is minimal
            among all triples in $H$ and delete it from $H$;
        if ( there exists $(s, r, c) \in J$ s.t. $(r, c) \cap (x_l, x_r) \neq \emptyset$
             and, for all $(w', x_l', x_r') \in H$, $(r, c) \cap (x_l', x_r') = \emptyset$ ) {
            pick and delete from $J$ the $(s, r, c)$ in the if test;
            set $\alpha'((s, r, c))$ to $w$;
            insert $(w + s, \max(x_l, r), \min(c, x_r))$ into $H$;
            if ($x_l < r$) { insert $(w, x_l, r)$ into $H$; }
            if ($c < x_r$) { insert $(w, c, x_r)$ into $H$; }
        } //if
    } //while

At this point, we have constructed a mapping $\alpha'$ from $I$ to $\mathbb{R}_{\ge 0}$. Now, define $G(V, E)$ by $V = I$ and $E = \{(v_i, v_j) \mid i \neq j,\ (r_i, c_i) \cap (r_j, c_j) \neq \emptyset \neq (\alpha'(v_i), \alpha'(v_i) + s_i) \cap (\alpha'(v_j), \alpha'(v_j) + s_j)\}$, where $v_k$ denotes $(s_k, r_k, c_k) \in I$. Order $(s, r, c) \in V$ in the same order as they have been deleted from $J$, and construct a coloring $f: V \to \{0, 1, \ldots\}$ of $G$ by means of First Fit. Finally, output $\alpha: I \to \mathbb{R}_{\ge 0}$ where $\alpha(v_i) = \alpha'(v_i) + f(v_i) \cdot \max_j (\alpha'(v_j) + s_j)$. // end

Our C++ implementation of IAC has $O(n \log n)$ running time and handles a few thousand triples in less than one second on a SUN SPARC 4.

THEOREM 2.1. Given a DSA instance $I$, the IAC algorithm builds a solution $\alpha$ to $I$ of cost not exceeding 3 times the optimum cost in $O(n \log n)$ time.

3 Compile-Time Memory Allocation

In this and the next sections, the definitions of the notions in Sans Serif are standard and can be found in the technical literature, cf. [1] and [2]. Let us also point out that the control-flow analysis is not part of CMA: poor analysis will produce a CMA instance $I$ with a short list of objects allowed to share memory, and, hence, there will be less opportunities for memory optimization. Now, assume that $F(I)$ is the control-flow graph of a CMA instance $I$. As it is shown in [1], each (source-code) object is associated with some of the vertices of $F$. Intuitively speaking, if an object is not associated with a specific vertex, then it does not need to be "maintained" at this vertex. We modify the instance $I$ so that the following property holds: (con) the vertices each object is associated with induce a connected component in the undirected graph corresponding to $F(I)$. The property (con) is crucial to Theorems 3.1 and 3.2. The conflict graph $G(I)$ of the CMA instance $I$ is the graph whose vertices are the (source-code) objects, and there is an edge between two objects iff there is a vertex $v$ of the control-flow graph $F(I)$ such that both objects are associated with $v$. We can view the control-flow graph $F(I)$ of a program in a structured programming language as a series parallel (S.-P.) graph by introducing dummy vertices and doing some edge reversals. Note that the CMA definition assumes a program in a structured programming language. The nesting depth $d(S)$ of a single vertex S.-P. graph $S$ is 1 by definition. If an S.-P. graph $S$ is the series (parallel) composition of two S.-P. graphs $S_1$ and $S_2$, then its nesting depth $d(S)$ is defined as $\max(d(S_1), d(S_2))$ ($\max(d(S_1), d(S_2)) + 1$).

THEOREM 3.1. Let $I$ be an instance of CMA. Given any weighting of the vertices of $G(I)$, a maximum weighted clique of $G(I)$ can be built in polynomial time.

We obtain the next result using decomposition techniques and a procedure closely related to IAC.

THEOREM 3.2. Let $F$ be the control-flow graph of a CMA instance $I$. A solution to $I$ of cost at most $3 \cdot d(F)$ times the optimum can be computed in polynomial time.

4 Conclusion

We can view Theorems 2.1 and 3.2 as positive results about the approximability of the NP-hard interval coloring problem of two classes of intersection graphs, i.e., of interval graphs and of intersection graphs related to CMA. Our approximation results are based on the construction of suitable infeasible solutions, cf. 2-allocations in [2], and on coloring of specific intersection graphs. As the next result shows, our approach can be used to design approximation algorithms for different intersection graph classes as well.

THEOREM 4.1. Given a weighted unit-disk graph, an interval coloring of cost at most 7 times the optimum can be constructed in linear time.

References

[1] J. Fabri, Automatic storage optimization, ACM SIGPLAN Notices, 14 (8), 1979, pp. 83-91.
[2] J. Gergov, Approximation algorithms for dynamic storage allocation, European Symposium on Algorithms (ESA'96), Springer, LNCS 1136, 1996, pp. 52-61.
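
As a supplementary illustration of the IAC procedure from Section 2, the following Python sketch follows the two phases literally: the incremental construction of the mapping α', then First Fit coloring of the conflict graph and the final shift by multiples of the height of α'. It is a naive quadratic-time reading of the pseudocode, not the author's C++ implementation and not an O(n log n) algorithm. The names `intersects`, `iac`, `pending` and `alpha1`, the tie-breaking among candidates, and the sample triples are assumptions made for this example. Triples are (s, r, c) with r < c, and all intervals are treated as open.

```python
def intersects(a_lo, a_hi, b_lo, b_hi):
    """Return True if the open intervals (a_lo, a_hi) and (b_lo, b_hi) intersect."""
    return max(a_lo, b_lo) < min(a_hi, b_hi)


def iac(triples):
    """Return one feasible offset per DSA triple (s, r, c), assuming r < c."""
    n = len(triples)
    pending = list(range(n))                       # J, as indices into triples
    H = [(0, min(r for _, r, _ in triples), max(c for _, _, c in triples))]
    alpha1 = {}                                    # alpha', possibly infeasible
    order = []                                     # order of deletion from J

    # Phase 1: incremental construction of alpha'.
    while pending:
        w, xl, xr = min(H)                         # a triple of H with minimal w
        H.remove((w, xl, xr))
        for i in pending:
            s, r, c = triples[i]
            if intersects(r, c, xl, xr) and not any(
                    intersects(r, c, yl, yr) for _, yl, yr in H):
                pending.remove(i)
                alpha1[i] = w
                order.append(i)
                H.append((w + s, max(xl, r), min(c, xr)))
                if xl < r:
                    H.append((w, xl, r))
                if c < xr:
                    H.append((w, c, xr))
                break                              # at most one placement per pick

    # Phase 2: First Fit coloring of the conflict graph in deletion order.
    height = max(alpha1[i] + triples[i][0] for i in range(n))
    color = {}
    for i in order:
        s_i, r_i, c_i = triples[i]
        used = {color[j] for j in color
                if intersects(r_i, c_i, triples[j][1], triples[j][2])
                and intersects(alpha1[i], alpha1[i] + s_i,
                               alpha1[j], alpha1[j] + triples[j][0])}
        k = 0
        while k in used:
            k += 1
        color[i] = k

    # Triples of color k are shifted up by k times the height of alpha'.
    return [alpha1[i] + color[i] * height for i in range(n)]


# Hypothetical instance, not taken from the paper.
print(iac([(40, 0, 5), (40, 5, 10), (16, 3, 8)]))   # [0, 0, 40]
```

Feasibility of the output does not depend on how ties are broken: two triples with the same color do not overlap under α', and triples with different colors end up in disjoint horizontal bands. The factor 3 of Theorem 2.1 is a property of the paper's algorithm and is not checked by this sketch.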
