Hardware-Software Partitioning in Embedded System Design
Péter Arató, Sándor Juhász, Zoltán Ádám Mann, András Orbán, Dávid Papp
Budapest University of Technology and Economics
Department of Control Engineering and Information Technology
H-1117 Budapest, Magyar tudósok körútja 2, Hungary
arato@[Link], jsanyi@[Link], {[Link], [Link], pappd}@[Link]
Abstract – One of the most crucial steps in the design of embedded systems is hardware-software partitioning, i.e. deciding which components of the system should be implemented in hardware and which ones in software. In this paper, different versions of the partitioning problem are defined, corresponding to real-time systems and cost-constrained systems, respectively. The authors prove the NP-hardness of these problems, identify efficiently solvable special cases, and present two new partitioning algorithms, one based on integer linear programming and one based on genetic algorithms, together with an empirical comparison of the two.

(This work was supported by OTKA T043329. This paper appeared in Proceedings of the 2003 IEEE International Symposium on Intelligent Signal Processing.)

I. INTRODUCTION
Today's computer systems typically consist of both hardware and software components. For instance, in an embedded signal processing application it is common to use both application-specific hardware accelerator circuits and general-purpose, programmable units with the appropriate software [1], [2].

This is beneficial since application-specific hardware is usually much faster than software, but it is also significantly more expensive. Software, on the other hand, is cheaper to create and to maintain, but slow. Hence, performance-critical components of the system should be realized in hardware, and non-critical components in software. This way, a good trade-off between cost and performance can be achieved.

However, this kind of system design is not without challenges. Usual hardware and software design methodologies are in many aspects inadequate for such design tasks. The composition of hardware and software elements also creates new problems, e.g. related to the communication of hardware and software components, as well as system architecture issues. In order to address these problems, hardware-software co-design (HSCD) methods have to be used [3]. One of the most crucial design steps in HSCD is hardware-software partitioning.

II. RELATED WORK

One of the most relevant works is presented in [5], [6]: a very sophisticated integer linear programming model for the joint partitioning and scheduling problem for a wide range of target architectures. This integer program is part of a 2-phase heuristic optimization scheme which aims at gaining better and better timing estimates using repeated scheduling phases, and using the estimates in the partitioning phases.

[7] presents a method for allocation of hardware resources for optimal partitioning. During the allocation algorithm, an estimated hardware/software partition is also built. The algorithm for this is basically a greedy algorithm: it takes the components one by one, and allocates the most critical building block of the current component to hardware. If the finite automaton realizing the control of the component also fits into hardware, then that component is moved to hardware.

An even more heuristic approach is described in [8]. This paper deals with a partitioning problem in which even the cost function to be optimized is a very complicated, heuristically weighted sum, which tries to take into account several optimization criteria. The paper also describes two heuristic algorithms: one based on simulated annealing and one based on tabu search.
[9] shows an algorithm to solve the joint problem of partitioning and scheduling. It consists of basically two local search heuristics: one for partitioning and one for scheduling. The two algorithms operate on the same graph, at the same time.

[10] considers partitioning in the design of ASIPs (application-specific integrated processors). It presents a formal framework and proposes a partitioning algorithm based on branch and bound.

[11] presents an approach that is largely orthogonal to other partitioning methods: it deals with the problem of hierarchically matching tasks to resources. It also shows a method for weighting partially defined user preferences, which can be very useful for multiple-objective optimization problems.

III. PROBLEM DEFINITION

We observed that previous works tried to capture too many details of the partitioning problem and the target architecture. Therefore, the solutions proposed in the literature

- do not scale well for large inputs but fall victim to combinatorial explosion; and/or
- are heuristic in nature, and drift away too much from optimal solutions.

Moreover, the complexity of the problem is hard to determine. Therefore, our main goal was to introduce a simplified model for partitioning, using which we can

- define different versions of the partitioning problem formally;
- analyze the complexity of the problems formally;
- give algorithms that can be used for the design of large systems as well.

Note that algorithms published previously in the literature have been tested on systems with some dozens of components. Our aim was to make our algorithms scalable for systems with hundreds or even thousands of components, so that they can indeed be used in practical, large-scale projects. For these reasons we had to keep the model as simple as possible.

A. Informal model

The characteristics of our model are the following:

- We consider only one software context (i.e. one general-purpose processor) and one hardware context (e.g. one FPGA). The components of the system have to be mapped to either one of these two contexts.
- Software implementation of a component is associated with a software cost, which is the running time of the component if implemented in software.
- Hardware implementation of a component is associated with a hardware cost, which can be for instance area, heat dissipation, energy consumption etc.
- Since hardware is significantly faster than software, the running time of components implemented in hardware is taken to be zero.
- If two communicating components are mapped to different contexts, this is associated with a (time-dimensional) communication overhead. If two components are mapped to the same context, then the communication between them is neglected.

One of the most important advantages of this simplified model is that scheduling does not have to be addressed explicitly. Hardware components do not have to be scheduled, because their running time is assumed to be zero. Software components do not have to be scheduled because there is only one processor, so that the overall running time will be the sum of the running times of the software components, regardless of their schedule. Therefore, we can completely decouple the partitioning problem from the scheduling problem, and focus solely on the partitioning problem.

B. Formal problem definition

An undirected simple graph $G = (V, E)$ is given. $G$ is the so-called task graph of the system: its nodes are the components of the system that have to be partitioned, and the edges represent communication between the components. $s(v)$ (or $s_v$) and $h(v)$ (or $h_v$) denote the software and hardware cost of node $v$, respectively, while $c(u, v)$ (or $c_{uv}$) denotes the communication cost between $u$ and $v$ if they are in different contexts (HW or SW).

$P$ is called a hardware-software (HW-SW) partition if it is a bipartition of $V$: $P = (V_H, V_S)$, where $V_H \cup V_S = V$ and $V_H \cap V_S = \emptyset$. The crossing edges are: $E_P = \{(u, v) \in E : u \in V_H, v \in V_S \text{ or } u \in V_S, v \in V_H\}$. The hardware cost of $P$ is $H_P = \sum_{v \in V_H} h_v$; the software cost of $P$ is $S_P = \sum_{v \in V_S} s_v + \sum_{(u,v) \in E_P} c_{uv}$, i.e. the software cost of the nodes and the communication cost; since both are time-dimensional, it makes sense to add them, and together they make up the running time of the system.

The following optimization and decision problems can be defined for the partitioning problem:

- P1: $H_0$ and $S_0$ are given. Is there a HW-SW partition $P$ so that $H_P \le H_0$ and $S_P \le S_0$?
- P2: $H_0$ is given. Find a HW-SW partition $P$ so that $H_P \le H_0$ and $S_P$ is minimal. (Cost-constrained systems)
- P3: $S_0$ is given. Find a HW-SW partition $P$ so that $S_P \le S_0$ and $H_P$ is minimal. (Systems with hard real-time constraints)
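To make the cost model concrete, here is a small Python sketch (our own illustration; the task graph, cost values and function names are hypothetical, not taken from the paper) that computes $H_P$ and $S_P$ for a given partition and checks the P3 constraint $S_P \le S_0$.

```python
# Minimal sketch of the partitioning cost model (illustrative data and names).

def hw_cost(hw_nodes, h):
    """Hardware cost H_P: sum of hardware costs of the nodes mapped to hardware."""
    return sum(h[v] for v in hw_nodes)

def sw_cost(hw_nodes, nodes, s, edges):
    """Software cost S_P: software costs of the software nodes plus the
    communication costs of the crossing edges (both are time-dimensional)."""
    sw_nodes = set(nodes) - set(hw_nodes)
    sw = sum(s[v] for v in sw_nodes)
    comm = sum(c for (u, v, c) in edges
               if (u in hw_nodes) != (v in hw_nodes))   # crossing edge
    return sw + comm

# Tiny example task graph with 4 components.
nodes = ["n1", "n2", "n3", "n4"]
s = {"n1": 5, "n2": 3, "n3": 8, "n4": 2}                # software running times
h = {"n1": 10, "n2": 6, "n3": 4, "n4": 7}               # hardware costs
edges = [("n1", "n2", 1), ("n2", "n3", 2), ("n3", "n4", 1)]

partition_hw = {"n3"}                                   # candidate partition: n3 in hardware
S0 = 15                                                 # software limit (real-time constraint, P3)
H_P = hw_cost(partition_hw, h)
S_P = sw_cost(partition_hw, nodes, s, edges)
print(H_P, S_P, S_P <= S0)                              # valid for P3 iff S_P <= S0
```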
C. Complexity

Theorem 1: P1 is NP-complete even if only graphs with no edges are considered.
Proof: P1 $\in$ NP, since if partitioning is possible with the given limits, a suitable partition itself serves as a certificate that can be checked in polynomial time. To prove NP-hardness, the KNAPSACK problem is reduced to P1 as follows. Given items with weights $w_i$, profits $p_i$, weight capacity $W_0$ and required profit $B$, take a graph with one node $v_i$ per item and no edges, and let $h_i := w_i$, $s_i := p_i$, $H_0 := W_0$ and $S_0 := \sum_i p_i - B$. (Since $E$ is empty, there is no need to define $c$.) We state that this instance of P1 is solvable iff the original KNAPSACK problem has a solution.

Assuming that P1 has a solution $P = (V_H, V_S)$, we have

$\sum_{v_i \in V_H} h_i \le H_0$,  (1)

$\sum_{v_i \in V_S} s_i \le S_0$.  (2)

(1) and (2) prove that the item set corresponding to $V_H$ is a solution of the original KNAPSACK problem: by (1) its total weight is at most $W_0$, and by (2) the total profit of the remaining items is at most $\sum_i p_i - B$, i.e. the total profit of the chosen items is at least $B$. Let us now assume that an item set $I$ solves the KNAPSACK problem. Therefore, $\sum_{i \in I} w_i \le W_0$ and $\sum_{i \notin I} p_i \le \sum_i p_i - B$, so putting the nodes corresponding to $I$ into hardware and the remaining nodes into software yields a partition with $H_P \le H_0$ and $S_P \le S_0$, i.e. a solution of this instance of P1.
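The reduction can be exercised on toy instances. The sketch below is our own illustration (hypothetical data and names, with a brute-force check that is only feasible for tiny inputs); it builds the edgeless P1 instance from a knapsack instance exactly as in the cost assignment sketched in the proof above.

```python
from itertools import combinations

# Knapsack instance (toy data): weights w, profits p, capacity W0, required profit B.
w = [4, 3, 2]
p = [5, 4, 3]
W0, B = 6, 8

# Reduction to P1 on an edgeless graph, as in the proof:
# hardware costs h := w, software costs s := p, H0 := W0, S0 := sum(p) - B.
h, s = list(w), list(p)
H0, S0 = W0, sum(p) - B

def knapsack_solvable():
    items = range(len(w))
    return any(sum(w[i] for i in sel) <= W0 and sum(p[i] for i in sel) >= B
               for r in range(len(w) + 1) for sel in combinations(items, r))

def p1_solvable():
    nodes = range(len(h))
    # With no edges, S_P is just the software cost of the nodes left in software.
    return any(sum(h[i] for i in hw) <= H0 and
               sum(s[i] for i in nodes if i not in hw) <= S0
               for r in range(len(h) + 1) for hw in combinations(nodes, r))

print(knapsack_solvable(), p1_solvable())   # the two answers coincide by construction
```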
It can be proven in the same way that the optimization versions P2 and P3 are NP-hard as well. Alternatively, P1 can be reduced to P2: P2 provides a solution in which $S_P$ is minimal (subject to $H_P \le H_0$); let this value be $S_{\min}$. Clearly, P1 is solvable iff $S_{\min} \le S_0$.

On the other hand, if communication is the only significant factor, i.e. the software and hardware costs are negligible compared to the communication costs, then the trivial optimal solution is to put every node into software. However, if there are some predefined constraints concerning the context of some nodes (i.e. the nodes in $V_S'$ are prescribed to be in software and the ones in $V_H'$ to be in hardware, where $V_S', V_H' \subseteq V$ and $V_S' \cap V_H' = \emptyset$), the problem reduces to finding a minimal weighted $s$-$t$-cut, which is an efficiently solvable special case.
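As an illustration of this special case, the following sketch (our own construction, not taken from the paper, and assuming the objective is to minimize the total communication cost of the crossing edges subject to the prescribed assignments) computes such a partition with networkx: prescribed software nodes are tied to the source and prescribed hardware nodes to the sink with infinite capacities, each communication edge becomes a pair of opposite arcs, and a minimum $s$-$t$ cut then gives the partition.

```python
# Sketch of the min-cut special case (assumes networkx is available; illustrative data).
# Only communication costs matter; some nodes are fixed to SW, others to HW.
import networkx as nx

nodes = ["n1", "n2", "n3", "n4"]
edges = [("n1", "n2", 3), ("n2", "n3", 1), ("n3", "n4", 2), ("n1", "n3", 1)]
fixed_sw = {"n1"}          # prescribed to software
fixed_hw = {"n4"}          # prescribed to hardware

G = nx.DiGraph()
for u, v, c in edges:                         # undirected communication edge -> two arcs
    G.add_edge(u, v, capacity=c)
    G.add_edge(v, u, capacity=c)
for v in fixed_sw:                            # tie prescribed nodes to the terminals
    G.add_edge("SRC", v, capacity=float("inf"))
for v in fixed_hw:
    G.add_edge(v, "SNK", capacity=float("inf"))

cut_value, (sw_side, hw_side) = nx.minimum_cut(G, "SRC", "SNK")
print(cut_value, sorted(sw_side - {"SRC"}), sorted(hw_side - {"SNK"}))
```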
IV. ILP-BASED ALGORITHM

We now give an integer linear programming (ILP) formulation of P3. Let $h$, $s$ and $c$ be the vectors representing the cost functions ($n$ is the number of nodes, $m$ is the number of edges). Let $E$ denote the transposed incidence matrix of $G$, that is, $E_{ji} = 1$ if edge $e_j$ starts in node $v_i$, $E_{ji} = -1$ if edge $e_j$ ends in node $v_i$, and $E_{ji} = 0$ otherwise. Furthermore, let $x$ be the vector with $x_i = 1$ if node $v_i$ is realized in hardware and $x_i = 0$ if node $v_i$ is realized in software. It can be seen that the components of the vector $|Ex|$ indicate exactly the crossing edges. This yields the program

minimize $h^T x$   (5a)
subject to $s^T (1 - x) + c^T |Ex| \le S_0$   (5b)
$x \in \{0, 1\}^n$   (5c)

The absolute value is handled by introducing the variables $y_1, \dots, y_m$ to eliminate the $|Ex|$:

minimize $h^T x$   (6a)
subject to $s^T (1 - x) + c^T y \le S_0$   (6b)
$y \ge Ex$   (6c)
$y \ge -Ex$   (6d)
$x \in \{0, 1\}^n$   (6e)

The last two programs are equivalent. If $x$ solves (5b)-(5c), then $(x, |Ex|)$ solves (6b)-(6e). On the other hand, if $(x, y)$ solves (6b)-(6e), then $x$ will solve (5b)-(5c) too, since $y \ge |Ex|$ and $c \ge 0$. Hence (6a)-(6e) is the ILP formulation of the P3 problem. We solve this integer program using LP-relaxation and branch-and-bound [15].
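For illustration, program (6a)-(6e) can be prototyped with an off-the-shelf modelling library; the sketch below uses PuLP (an assumption of this example, not the authors' implementation, which relies on LP-relaxation and branch-and-bound) on hypothetical toy data.

```python
# Sketch of the ILP (6a)-(6e) for P3 using PuLP (illustrative; not the paper's own solver).
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, value

# Toy task graph: node costs and communication costs on the edges.
n_nodes = 4
h = [10, 6, 4, 7]                            # hardware costs
s = [5, 3, 8, 2]                             # software costs (running times)
edges = [(0, 1, 1), (1, 2, 2), (2, 3, 1)]    # (u, v, c_uv)
S0 = 15                                      # software limit (real-time constraint)

prob = LpProblem("P3_partitioning", LpMinimize)
x = [LpVariable(f"x_{i}", cat=LpBinary) for i in range(n_nodes)]     # 1 = hardware
y = [LpVariable(f"y_{j}", lowBound=0) for j in range(len(edges))]    # y_j >= |(Ex)_j|

# (6a) minimize the hardware cost h^T x
prob += lpSum(h[i] * x[i] for i in range(n_nodes))

# (6b) software cost of software nodes plus communication cost of crossing edges <= S0
prob += lpSum(s[i] * (1 - x[i]) for i in range(n_nodes)) + \
        lpSum(c * y[j] for j, (_, _, c) in enumerate(edges)) <= S0

# (6c)-(6d) linearization of |x_u - x_v| for every edge
for j, (u, v, _) in enumerate(edges):
    prob += y[j] >= x[u] - x[v]
    prob += y[j] >= x[v] - x[u]

prob.solve()
print([int(value(xi)) for xi in x], value(prob.objective))
```

Since $c \ge 0$ and the objective does not involve $y$, the solver drives each $y_j$ down to $|x_u - x_v|$, so constraint (6b) behaves exactly like (5b).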
V. GENETIC ALGORITHM

Although the ILP-based solution is efficient for graphs with up to some hundreds of nodes, and it produces the exact optimum, it cannot be used to partition even bigger graphs. For this purpose, we also developed a genetic algorithm (GA). For GA in general, see [16], [17] and references therein.

Individuals. The partitioning problem is fortunate from the point of view of a genetic algorithm. The applicability of genetic algorithms requires that the solutions of the optimization problem can be represented by means of a vector with meaningful components: this is the condition for recombination to work on the actual features of a solution. Fortunately, there is an obvious vector representation in the case of the partitioning problem: each partition is a bit vector, just like in the ILP program.

Population. The population is a set of individuals. The question is whether non-valid individuals, i.e. those violating for instance the software limit in the case of P3, should also be allowed in the population. Since non-valid individuals violate some important design constraint, it seems to be logical at first glance to work with valid partitions only. However, this approach would have several drawbacks: first, invalid individuals may contain valuable patterns that should be propagated, and second, it is hard to guarantee that genetic operations do not generate non-valid individuals even from valid ones. This holds for both mutation and recombination. For these reasons we decided to allow non-valid individuals as well in the population. Of course the GA must produce a valid partition at the end, so we must make sure to insert some valid individuals into the initial population, and choose the fitness function in such a way that it punishes invalidity.

Our tests showed that the population size should be around 300.

Initial population. In order to guarantee diversity in the population, the initial population usually consists of randomly chosen individuals. However, this method does not guarantee that there will be valid individuals in the initial population; in fact, if the problem space is big and the constraints are tight, then the chances are very low for this. Generating random valid individuals with approximately uniform distribution would be a promising alternative, but it is by no means obvious how one could implement such a scheme. Therefore we chose to fill the initial population partly with randomly selected, not necessarily valid individuals, and partly with valid individuals generated by a fast greedy algorithm. This way, there are valid individuals, but also a wide variety of other individuals in the initial population. Clearly, the ratio between the two kinds of individuals in the initial population is a crucial parameter of the GA. The tests have shown that about one third of the individuals in the initial population should be chosen randomly.
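The paper does not detail the fast greedy algorithm used to seed valid individuals; the sketch below is one plausible interpretation (entirely our assumption): starting from the all-software partition, nodes are greedily moved to hardware until the software limit is met, and the rest of the initial population is filled with random, possibly invalid bit vectors.

```python
import random

# One plausible reading (our assumption, not the paper's algorithm) of how the initial
# population could be seeded for P3.

def software_cost(x, s, edges):
    sw = sum(s[i] for i, xi in enumerate(x) if xi == 0)
    comm = sum(c for (u, v, c) in edges if x[u] != x[v])
    return sw + comm

def greedy_valid_individual(n, s, h, edges, S0):
    x = [0] * n                                    # start with everything in software
    while software_cost(x, s, edges) > S0:
        candidates = [i for i in range(n) if x[i] == 0]
        sample = random.sample(candidates, min(3, len(candidates)))
        # heuristic choice (our assumption): best software-cost per hardware-cost ratio
        best = max(sample, key=lambda i: s[i] / max(h[i], 1e-9))
        x[best] = 1                                # move it to hardware
    return x

def initial_population(size, n, s, h, edges, S0, random_ratio=1/3):
    # about one third random (possibly invalid) individuals, the rest greedily built valid ones
    pop = [[random.randint(0, 1) for _ in range(n)]
           for _ in range(int(size * random_ratio))]
    while len(pop) < size:
        pop.append(greedy_valid_individual(n, s, h, edges, S0))
    return pop
```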
Fitness function. Since we mainly focused on the P3 problem, the objective is to minimize the hardware cost. However, since invalid individuals should be punished, the fitness has a second component: the measure of invalidity $i(p)$, defined as the amount by which the software cost (including communication cost) exceeds the software limit (and 0 for a valid partition).

We tried several versions of the fitness function, which fall basically into two categories: those ranking every valid individual in front of the invalid ones, and the less rigorous ones that allow invalid individuals to beat valid ones. Later in our tests a rigorous version proved to be best:

$f(p) = H_P$ if $p$ is valid, and $f(p) = K + \alpha \cdot i(p)$ if $p$ is invalid,

where $H_P$ is the hardware cost, $\alpha$ is an appropriate constant determined by the tests, and $K$ is a sufficiently large constant that makes each invalid individual have a higher fitness than any valid individual. (Note that the fitness has to be minimized.)

Genetic operations. Mutation, recombination and selection are used.

Mutation is done in the new population; each gene of each individual is altered with the same probability.

In the case of recombination we tested both one-point crossover and two-point crossover. Moreover, we tested two schemes for choosing the individuals for crossover: in the first one, all individuals are chosen with the same probability; in the second one, the probability of choosing a particular individual depends on its fitness, so that better individuals are chosen with a higher probability. So we tested four different versions of the recombination operator. According to the test results, the best strategy is to choose individuals with probabilities determined by their fitness, and to use two-point crossover (although one-point crossover is not much worse).

Selection is usually realized as filling some part of the new population with the best individuals of the old population. However, since some versions of the fitness function rank all invalid individuals behind the valid ones, this would mean that invalid individuals are completely omitted from selection. Therefore, a given ratio of the selected individuals is taken from the invalid individuals.
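The fitness function and the genetic operators described above can be summarised in code; the following sketch is our own illustration (constants, probabilities and helper names are placeholders rather than values from the paper, except where the text states them explicitly).

```python
import random

# Illustrative sketch of the GA ingredients described above (not the authors' implementation).
# An individual is a bit vector: 1 = hardware, 0 = software.

def fitness(x, h, s, edges, S0, K=10**9, alpha=1.0):
    """Rigorous fitness variant: valid individuals are ranked by hardware cost,
    invalid ones by K + alpha * invalidity, so every valid beats every invalid.
    The fitness is to be minimized."""
    hw = sum(h[i] for i, xi in enumerate(x) if xi == 1)
    sw = sum(s[i] for i, xi in enumerate(x) if xi == 0) + \
         sum(cc for (u, v, cc) in edges if x[u] != x[v])
    invalidity = max(0, sw - S0)          # amount by which the software limit is exceeded
    return hw if invalidity == 0 else K + alpha * invalidity

def mutate(x, p_gene=0.01):
    """Each gene is flipped with the same probability."""
    return [1 - xi if random.random() < p_gene else xi for xi in x]

def two_point_crossover(a, b):
    """Two-point crossover of two parent bit vectors."""
    i, j = sorted(random.sample(range(len(a)), 2))
    return a[:i] + b[i:j] + a[j:]

def select_parent(pop, fitnesses):
    """Fitness-dependent choice: lower fitness (better) gets a higher weight."""
    weights = [1.0 / (1.0 + f) for f in fitnesses]
    return random.choices(pop, weights=weights, k=1)[0]
```

In a full run, part of the new population would also be filled with the best old individuals, with a fixed share reserved for invalid ones, as described above.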
Stopping criteria. The GA takes a given minimum number of steps (generations). After that, it stops if the best individual has not improved during a given number of further steps, which is also a parameter.
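A minimal sketch of this stopping rule (our own illustration; parameter names and default values are placeholders): the loop below runs at least `min_generations` generations and then stops once the best fitness has not improved for `patience` further generations.

```python
# Sketch of the stopping rule described above (illustrative; parameter values are placeholders).
def run_ga(step, initial_pop, fitness, min_generations=100, patience=30):
    """step(pop) -> new population; fitness(x) -> value to minimize."""
    pop = initial_pop
    best = min(fitness(x) for x in pop)
    since_improvement = 0
    generation = 0
    while generation < min_generations or since_improvement < patience:
        pop = step(pop)
        current_best = min(fitness(x) for x in pop)
        if current_best < best:
            best, since_improvement = current_best, 0
        else:
            since_improvement += 1
        generation += 1
    return pop, best
```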
VI. EMPIRICAL RESULTS

Since the GA has a large number of parameters, we could not test all configurations. Rather, we first tuned the less sensitive parameters and fixed them to the best values found, and then turned to the more sensitive ones. After having tuned the parameters of the GA, we fixed them to the best found configuration, and moved on to compare it with the ILP algorithm on industry benchmarks. The benchmarks we used are cryptographic algorithms (IDEA, MARS and RC6), in communication-dominated and processing-dominated variants.

[Fig. 3. The ratio of the results of GA and the optimum (y-axis: ratio in %; x-axis: test cases 1-7; curves for IDEA, MARS and RC6 in communication-dominated and processing-dominated variants).]
On average, the solutions found by the GA are not far from the optimum. However, the gap increases with the problem size.

Figure 3 shows the deviation of the result of the GA from the optimum in detail. On the x-axis the different tests according to the software limit are shown; on the y-axis the ratio of the solution found by the GA and the optimum can be seen. In test cases 1 and 7 the software limit was chosen in such a way that the optimum is to put every component in software or hardware, respectively. It can be seen that the GA finds these extremes in every case. Another consequence of the figure is that the difficulty of the problem depends strongly on the software limit. Tests 5 and 6 seemed to be the hardest problems, where in the worst case only an approximation ratio of 2 could be achieved. The best results were found in the 'IDEA, communication-dominated' scenario, in which all the seven test cases resulted in the optimal solution.

The easy-hard-easy pattern that can be recognized in Figure 3 is in accordance with previous findings in the literature [18]. We could expect a similar pattern in the running time of the ILP algorithm; however, this is not true. In fact, the running time of the ILP algorithm oscillates wildly. (In contrast, the running time of the GA is quite consistent.) This might be caused by the previously observed heavy-tailed runtime distribution of search algorithms [19]. It is possible that the ILP algorithm exhibits a similarly clear easy-hard-easy pattern on average, but this phenomenon is hidden by the large deviation caused by the heavy-tailed property. In order to decide this, we are planning to implement a randomized ILP algorithm and test its average behaviour.

VII. CONCLUSION

In this paper, we have introduced a new, simplified model for the hardware-software partitioning problem. This model makes it possible to define different versions of the partitioning problem formally. In particular, we have proven that the problem is NP-hard in the general case, but we could also identify some efficiently solvable, meaningful special cases. Moreover, we presented two new partitioning algorithms: one based on ILP, and a genetic algorithm. It turned out in our empirical tests that the ILP-based solution works efficiently for graphs with several hundreds of nodes and yields optimal solutions, whereas the genetic algorithm gives near-optimal solutions on average, and works efficiently even with graphs of thousands of nodes. Moreover, we observed an easy-hard-easy pattern in the performance of the GA, and wild oscillations in the running time of the ILP algorithm.

Our future plans include randomization of our ILP-based algorithm, developing better bounds for the branch-and-bound scheme used in the ILP algorithm, as well as the inclusion of some simple scheduling methods into our partitioning model.

REFERENCES

[1] P. D. Randall, "Hardware/software co-design and digital signal processing," MSc thesis, University of Oxford, May 1996.
[2] P. Arató, Z. Á. Mann, and A. Orbán, "Hardware-software co-design for Kohonen's self-organizing map," in Proceedings of the IEEE 7th International Conference on Intelligent Engineering Systems, 2003.
[3] R. K. Gupta, Co-Synthesis of Hardware and Software for Digital Embedded Systems, PhD thesis, Stanford University, December 1993.
[4] Z. Á. Mann and A. Orbán, "Optimization problems in system-level synthesis," in Proceedings of the 3rd Hungarian-Japanese Symposium on Discrete Mathematics and Its Applications, 2003.
[5] R. Niemann and P. Marwedel, "An algorithm for hardware/software partitioning using mixed integer linear programming," Design Automation for Embedded Systems, special issue: Partitioning Methods for Embedded Systems, vol. 2, pp. 165–193, March 1997.
[6] R. Niemann, Hardware/Software Co-Design for Data Flow Dominated Embedded Systems, Kluwer Academic Publishers, 1998.
[7] J. Grode, P. V. Knudsen, and J. Madsen, "Hardware resource allocation for hardware/software partitioning in the LYCOS system," in Proceedings of Design Automation and Test in Europe (DATE '98), 1998.
[8] P. Eles, Z. Peng, K. Kuchcinski, and A. Doboli, "Hardware/software partitioning of VHDL system specifications," in Proceedings of EURO-DAC '96, 1996.
[9] K. S. Chatha and R. Vemuri, "MAGELLAN: Multiway hardware-software partitioning and scheduling for latency minimization of hierarchical control-dataflow task graphs," in Proceedings of CODES '01, 2001.
[10] N. N. Binh, M. Imai, A. Shiomi, and N. Hikichi, "A hardware/software partitioning algorithm for designing pipelined ASIPs with least gate counts," in Proceedings of the 33rd Design Automation Conference, 1996.
[11] G. Quan, X. Hu, and G. Greenwood, "Preference-driven hierarchical hardware/software partitioning," in Proceedings of the IEEE/ACM International Conference on Computer Design, 1999.
[12] C. H. Papadimitriou, Computational Complexity, Addison-Wesley, 1994.
[13] D. S. Hochbaum, Ed., Approximation Algorithms for NP-Hard Problems, PWS Publishing, Boston, MA, 1997.
[14] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, Network Flows: Theory, Algorithms, and Applications, Prentice-Hall, 1993.
[15] A. Schrijver, Theory of Linear and Integer Programming, Wiley, 1998.
[16] L. Davis, Handbook of Genetic Algorithms, Van Nostrand Reinhold, 1991.
[17] W. Kinnebrock, Optimierung mit genetischen und selektiven Algorithmen, Oldenburg, 1994.
[18] D. Achlioptas, C. P. Gomes, H. Kautz, and B. Selman, "Generating satisfiable instances," in Proceedings of the Seventeenth National Conference on Artificial Intelligence, 2000.
[19] C. P. Gomes, B. Selman, N. Crato, and H. Kautz, "Heavy-tailed phenomena in satisfiability and constraint satisfaction problems," Journal of Automated Reasoning, vol. 24, no. 1-2, pp. 67–100, 2000.