Clock-Gating and Its Application to Low Power Design of Sequential Circuits
Qing WU
Department of Electrical Engineering-Systems, University of Southern California
Los Angeles, CA 90089, USA, Phone: (213)740-4480
Massoud PEDRAM
Department of Electrical Engineering-Systems, University of Southern California
Los Angeles, CA 90089, USA, Phone: (213)740-4458
Xunwei WU
Department of Electronic Engineering, Hangzhou University
Hangzhou, Zhejiang 310028, CHINA
ABSTRACT
This paper models the clock behavior in a sequential circuit by a quaternary variable and uses this representation
to propose and analyze two clock-gating techniques. It then uses the covering relationship between the triggering
transition of the clock and the active cycles of various flip-flops to generate a derived clock for each flip-flop in
the circuit. Design examples using gated clocks are provided next. Experimental results show that these designs
have ideal logic functionality with lower power dissipation compared to traditional designs.
I. INTRODUCTION
The sequential circuits in a system are considered major contributors to the power dissipation since one input of
sequential circuits is the clock, which is the only signal that switches all the time. In addition, the clock signal
tends to be highly loaded. To distribute the clock and control the clock skew, one needs to construct a clock
network (often a clock tree) with clock buffers. All of this adds to the capacitance of the clock net. Recent studies
indicate that the clock signals in digital computers consume a large fraction (15% to 45%) of the system power
(1). Thus, the circuit power can be greatly reduced by reducing the clock power dissipation.
Most efforts for clock power reduction have focused on issues such as reduced voltage swings, buffer insertion
and clock routing (2). In many cases switching of the clock causes a lot of unnecessary gate activity. For that
reason, circuits are being developed with controllable clocks. This means that from the master clock other clocks
are derived which, based on certain conditions, can be slowed down or stopped completely with respect to the
master clock. Obviously, this scheme results in power savings due to the following factors:
1) Load on the master clock is reduced and the number of required buffers in the clock tree is decreased.
Therefore, the power dissipation of the clock tree can be reduced.
2) The flip-flop receiving the derived clock is not triggered in idle cycles; the corresponding dynamic power
dissipation is thus saved.
3) The excitation function of the flip-flop triggered by the derived clock may be simplified since it has a don’t-care
condition in the cycle when the flip-flop is not triggered by the derived clock.
In (3) the authors presented a technique for saving power in the clock tree by stopping the clock fed into idle
modules. However, a number of engineering issues related to the design of the clock tree were not addressed and
hence, the proposed approach has not been adopted in practice.
This paper investigates various issues in deriving a gated clock from a master clock. In section II, a quaternary
variable is used to model the clock behavior and to discuss its triggering action on flip-flops. Based on this
analysis, two clock-gating schemes are proposed. In section III, we use the covering relation between the clock
and the transition behaviors of the triggered flip-flops to derive conditions for gating the master clock. Two
common sequential circuits, i.e., an 8421 BCD code up-counter and a three-excess counter, are then described to
illustrate the procedure for finding a derived clock. In section IV, a new technique for clock-gating is presented
which generates a clock synchronous with the master clock. This eliminates the additional skew between the
master clock and the derived clock. Thus, the designed sequential circuit is a synchronous one. Finally, we
present circuit simulation results to demonstrate the quality of the derived clock and its ability to reduce power
dissipation in the circuit.
II. DESCRIPTION FOR CLOCK BEHAVIOR AND CLOCK-GATING
In a synchronous system, a flip-flop is triggered by a certain directional transition of a clock signal. For the clock
to be another signal rather than the master clock, it must offer the same directional transition to trigger the flip-
flop, and it must be “in step” with the master clock.
For the clock signal $clk$ in a circuit, if we denote its logic values before and after a transition as $clk(t)$ and
$clk^{+}(t)$ respectively, four combinations can be used to express different behaviors of the clock as shown in Table 1,
where a special quaternary variable $\widetilde{clk}$ denotes the corresponding behavior. The four values are
$(0, \alpha, \beta, 1)$, where $\alpha$, $\beta$ represent two kinds of transition behaviors and 0, 1 represent two kinds
of holding behaviors. (Note that although they have the same forms as signal values 0 and 1, their meanings are different.)
Table 1 QUATERNARY REPRESENTATION FOR BEHAVIORS OF A SIGNAL
$\widetilde{clk}$    $clk(t) \to clk^{+}(t)$    Behavior
0                    0 → 0                      0-holding
α                    0 → 1                      α-transition
β                    1 → 0                      β-transition
1                    1 → 1                      1-holding
In addition, we can also define a literal operation to identify the behavior of a clock:
$$clk^{b} = \begin{cases} 1 & \text{if } \widetilde{clk} = b \\ 0 & \text{if } \widetilde{clk} \neq b \end{cases}, \tag{1}$$
where $b \in \{0, \alpha, \beta, 1\}$. Thus, the rising transition $clk^{\alpha}$ and the falling transition $clk^{\beta}$
of a clock are binary variables and can serve as arguments of Boolean operations. For example, from Table 1 we have
$clk^{0} = \overline{clk} \cdot \overline{clk^{+}}$, $clk^{\alpha} = \overline{clk} \cdot clk^{+}$,
$clk^{\beta} = clk \cdot \overline{clk^{+}}$ and $clk^{1} = clk \cdot clk^{+}$.
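The quaternary encoding and the literal operation of (1) can be sketched in a few lines of Python. This is only an illustration: the function names `behavior` and `literal` and the string labels `"alpha"` and `"beta"` are assumptions made here, not notation from the paper. The sketch checks that each literal agrees with the Boolean product read off Table 1.

```python
# Illustrative sketch of Table 1 and the literal operation (1).
# behavior(), literal() and the labels "alpha"/"beta" are hypothetical names.

def behavior(clk: int, clk_next: int) -> str:
    """Map the pair (clk(t), clk+(t)) to its quaternary value per Table 1."""
    table = {(0, 0): "0", (0, 1): "alpha", (1, 0): "beta", (1, 1): "1"}
    return table[(clk, clk_next)]

def literal(clk: int, clk_next: int, b: str) -> int:
    """Literal clk^b: 1 if the behavior equals b, 0 otherwise."""
    return int(behavior(clk, clk_next) == b)

# Boolean products for the four literals, with complements written as (1 - x).
boolean_form = {
    "0":     lambda c, n: (1 - c) & (1 - n),
    "alpha": lambda c, n: (1 - c) & n,
    "beta":  lambda c, n: c & (1 - n),
    "1":     lambda c, n: c & n,
}

def literals_match_boolean_forms() -> bool:
    """Compare literal() with the Boolean products for all four behaviors."""
    return all(
        literal(c, n, b) == boolean_form[b](c, n)
        for c in (0, 1) for n in (0, 1) for b in boolean_form
    )

print(literals_match_boolean_forms())
```

Exactly one of the four literals equals 1 for any given pair of values, which is what allows $clk^{\alpha}$ and $clk^{\beta}$ to be treated as ordinary binary variables.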
Assume that there are $n$ flip-flops in a sequential circuit and that their outputs and clock inputs are denoted by
$Q_i$ and $clk_i$, $i = 0, 1, \ldots, n-1$, respectively. For a synchronous sequential circuit, we have $clk_i = clk$,
namely all flip-flops are triggered by the same master clock signal $clk$. However, if a flip-flop $Q_i$ is to be
disconnected from the master clock during some (idle) cycles, then we have to use a derived clock for $Q_i$. Notice that
this derived clock should be “in step” with the master clock for the circuits to remain synchronous.
Generally, we consider that the derived clock is obtained from the master clock $clk$ and the outputs of other
flip-flops $Q_0, \ldots, Q_{i-1}, Q_{i+1}, \ldots, Q_{n-1}$ (which make transitions following the triggering transition
of their respective clocks). Since both AND gating and OR gating can be used for controlling the master clock, we have
the following two clock-gating forms:
$$clk_i = g_i + p_i \cdot clk, \tag{2}$$
$$clk_i = g_i \cdot (p_i + clk), \tag{3}$$
where $g_i$ and $p_i$ are functions of flip-flop outputs $Q_0, \ldots, Q_{i-1}, Q_{i+1}, \ldots, Q_{n-1}$.
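As a minimal sketch, the two gating forms can be written as single-bit Boolean functions. The function names below are hypothetical, and $g_i$, $p_i$ are passed in as already-evaluated bits rather than computed from flip-flop outputs.

```python
# Illustrative sketch of the clock-gating forms (2) and (3) on single bits.
# gated_form_2() and gated_form_3() are hypothetical names.

def gated_form_2(g: int, p: int, clk: int) -> int:
    """Form (2): clk_i = g_i + p_i . clk (AND gating of the master clock)."""
    return g | (p & clk)

def gated_form_3(g: int, p: int, clk: int) -> int:
    """Form (3): clk_i = g_i . (p_i + clk) (OR gating of the master clock)."""
    return g & (p | clk)

for clk in (0, 1):
    # With g = 0, p = 1, form (2) passes the master clock through unchanged.
    assert gated_form_2(0, 1, clk) == clk
    # With g = 0, p = 0, form (2) holds the derived clock at 0.
    assert gated_form_2(0, 0, clk) == 0
    # With g = 1, p = 0, form (3) passes the master clock through unchanged.
    assert gated_form_3(1, 0, clk) == clk
    # With g = 1, p = 1, form (3) holds the derived clock at 1.
    assert gated_form_3(1, 1, clk) == 1
```

In both forms one setting of $(g_i, p_i)$ makes the gate transparent and another freezes the derived clock, which is the mechanism used to suppress triggering in idle cycles.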
Consider a flip-flop triggered by the falling clock transition as an example (i.e. a negative edge-triggered
flip-flop). The timing relationships of $clk$, $p_i$, $p_i \cdot clk$ and $p_i + clk$ are shown in Fig.1. Note that
$p_i$ exhibits a delay with respect to the falling transition of clock, may have glitches (represented by vertical grid
lines), and has its final stable value in the zone where $clk = 0$. We can see that $p_i + clk$ cannot prevent the
glitches and may even lead to an extra glitch. Therefore, only (2) is suitable for the negative edge-triggered
flip-flops while (3) is not. Note that $g_i$ in (2) must be glitch-free when $clk = 0$.
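The glitch argument can be illustrated with sampled waveforms. The sample values below are invented for illustration and are not taken from Fig.1; they only assume, as in the text, that $p_i$ changes (with glitches) after the falling clock edge and settles while $clk = 0$.

```python
# Illustrative waveforms (hypothetical sample values, one entry per time step).
# p glitches after the falling edge of clk and settles to 1 while clk = 0.

def falling_edges(wave):
    """Return the sample indices t at which the wave goes from 1 to 0."""
    return [t for t in range(1, len(wave)) if wave[t - 1] == 1 and wave[t] == 0]

clk = [1, 1, 0, 0, 0, 0, 1, 1, 0, 0]
p   = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

and_gated = [pi & c for pi, c in zip(p, clk)]   # p_i . clk, as used in (2)
or_gated  = [pi | c for pi, c in zip(p, clk)]   # p_i + clk, as used in (3)

print(falling_edges(clk))        # [2, 8]
print(falling_edges(and_gated))  # [8]
print(falling_edges(or_gated))   # [2, 4]
```

In this example every falling edge of $p_i \cdot clk$ coincides with a falling edge of $clk$, because $clk = 0$ masks the glitches of $p_i$. The OR form shows a falling edge at index 4 that has no counterpart in $clk$, so a negative edge-triggered flip-flop would be triggered spuriously.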
The above discussion shows that the falling transition of $clk_i$ in (2) occurs for the following two cases:
(1) When $g_i = 0$ and $p_i = 1$, falling transition of $clk$ leads to falling transition of the derived clock $clk_i$.
Therefore, $p_i$ may be named the transition propagate term.
(2) When $g_i = 1$ and $g_i$ makes a falling transition, the derived clock $clk_i$ makes a falling transition since
$clk$ and hence $p_i \cdot clk$ are 0 at that time instance. Therefore, $g_i$ may be named the transition generate term.
Figure 1 Timing relationship of $clk$, $p_i$ ($g_i$), $p_i \cdot clk$ and $p_i + clk$
From this analysis, we obtain
$$clk_i^{\beta} = g_i^{\beta} + \overline{g_i} \cdot p_i \cdot clk^{\beta}. \tag{4}$$
Similarly, we can show that the derived clock signal in (3) is suitable for the flip-flops triggered by the rising
transition of the clock. Here $g_i$ in (3) must be glitch-free when $clk = 1$. The rising transition of $clk_i$ can be
expressed as
$$clk_i^{\alpha} = g_i^{\alpha} + g_i \cdot \overline{p_i} \cdot clk^{\alpha}. \tag{5}$$
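Expressions (4) and (5) can be checked against the gating forms (2) and (3) by enumeration. The sketch below reads the second term of (4) with $g_i$ complemented and that of (5) with $p_i$ complemented, as implied by the propagate conditions ($g_i = 0$, $p_i = 1$ for (2); $g_i = 1$, $p_i = 0$ for (3)). It also assumes the timing discipline described in the text: $g_i$ and $p_i$ are held across every transition of $clk$ and change only while $clk$ sits at its stable level (0 for (2), 1 for (3)). All function names are hypothetical.

```python
from itertools import product

def fall(x, x_next):
    """Beta literal: 1 only for a 1 -> 0 transition."""
    return x & (1 - x_next)

def rise(x, x_next):
    """Alpha literal: 1 only for a 0 -> 1 transition."""
    return (1 - x) & x_next

def events(stable_level):
    """Yield (g, g+, p, p+, clk, clk+) tuples allowed by the assumed timing."""
    for g, p in product((0, 1), repeat=2):
        yield g, g, p, p, 0, 1          # clk rises, g and p held
        yield g, g, p, p, 1, 0          # clk falls, g and p held
    for g, gn, p, pn in product((0, 1), repeat=4):
        # g and p may change only while clk holds its stable level
        yield g, gn, p, pn, stable_level, stable_level

def check_eq4():
    """Compare the falling literal of form (2) with the right side of (4)."""
    for g, gn, p, pn, c, cn in events(stable_level=0):
        lhs = fall(g | (p & c), gn | (pn & cn))
        rhs = fall(g, gn) | ((1 - g) & p & fall(c, cn))
        if lhs != rhs:
            return False
    return True

def check_eq5():
    """Compare the rising literal of form (3) with the right side of (5)."""
    for g, gn, p, pn, c, cn in events(stable_level=1):
        lhs = rise(g & (p | c), gn & (pn | cn))
        rhs = rise(g, gn) | (g & (1 - p) & rise(c, cn))
        if lhs != rhs:
            return False
    return True

print(check_eq4(), check_eq5())
```

Both checks pass under the stated timing assumption. If $g_i$ were allowed to fall while $clk = 1$ and $p_i = 1$, the derived clock in (2) would stay at 1 although $g_i^{\beta} = 1$, so the assumption is needed for (4) to hold.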
It should be pointed out that the attached circuitry needed for generating the derived clock should be simple to
avoid excessive power dissipation due to this overhead circuitry. Therefore $g_i$ and $p_i$ in (2) and (3) should be
relatively simple functions. Especially, we require $g_i$ to be simple to avoid dangerous glitches. Note that if
$g_i = 0$, $p_i = 1$ in (2) or $g_i = 1$, $p_i = 0$ in (3), we return to the condition of applying the master clock
$clk$ in a synchronous sequential circuit.
III. DESIGN OF SEQUENTIAL CIRCUITS BASED ON DERIVED CLOCK
Assume that the derived clock for the flip-flop $Q_i$ is $clk_i$. Falling transitions of $clk_i$ have to cover all
cycles when the flip-flop $Q_i$ makes transitions, $Q_i^{\alpha}$ and $Q_i^{\beta}$. The covering relation can be
expressed as:
$$clk_i^{\beta} \geq (Q_i^{\alpha} + Q_i^{\beta}). \tag{6}$$
Since AND and OR operations on Boolean variables can be interpreted as minimum and maximum operations on
these variables, i.e. $x \cdot y = \min(x, y)$ and $x + y = \max(x, y)$, we can obtain the following equations from (6):
$$clk_i^{\beta} \cdot (Q_i^{\alpha} + Q_i^{\beta}) = (Q_i^{\alpha} + Q_i^{\beta}), \tag{7}$$
$$clk_i^{\beta} + (Q_i^{\alpha} + Q_i^{\beta}) = clk_i^{\beta}. \tag{8}$$
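The step from (6) to (7) and (8) rests on a general property of Boolean values: $x \geq y$ holds exactly when $x \cdot y = y$, and exactly when $x + y = x$. A short enumeration confirms this, with $x$ standing for $clk_i^{\beta}$ and $y$ for $(Q_i^{\alpha} + Q_i^{\beta})$. The function name is hypothetical.

```python
from itertools import product

def covering_forms_agree() -> bool:
    """Check x >= y  <=>  min(x, y) == y  <=>  max(x, y) == x on {0, 1}."""
    for x, y in product((0, 1), repeat=2):
        covers = x >= y            # form (6)
        as_and = min(x, y) == y    # form (7), AND read as minimum
        as_or = max(x, y) == x     # form (8), OR read as maximum
        if not (covers == as_and and covers == as_or):
            return False
    return True

print(covering_forms_agree())
```

The only pair that violates all three forms is $x = 0$, $y = 1$, i.e. a cycle in which the flip-flop must change state but the derived clock provides no falling transition.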
Therefore, we should first obtain $(Q_i^{\alpha} + Q_i^{\beta})$ and then generate the derived clock $clk_i$ for
flip-flop $Q_i$. We will show the procedure by using design examples.
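The first step, obtaining $(Q_i^{\alpha} + Q_i^{\beta})$, amounts to listing the states in which flip-flop $Q_i$ changes value. The sketch below does this for a counter that steps through 0 to 9 and wraps to 0, which is the usual meaning of an 8421 BCD up-counter. The bit ordering ($Q_0$ as least significant bit) and all function names are assumptions made for illustration.

```python
# Illustrative sketch: states in which each flip-flop of a decade counter
# changes value. Assumes Q_0 is the least significant bit.

def bit(state: int, i: int) -> int:
    """Value of flip-flop Q_i in the given state."""
    return (state >> i) & 1

def bcd_next(state: int) -> int:
    """Next state of a decade up-counter: 0, 1, ..., 9, 0, ..."""
    return (state + 1) % 10

def active_states(next_state, states, i):
    """States where Q_i^alpha + Q_i^beta = 1, i.e. Q_i differs from Q_i+."""
    return [s for s in states if bit(s, i) != bit(next_state(s), i)]

activity = {i: active_states(bcd_next, range(10), i) for i in range(4)}
for i, states in activity.items():
    print(f"Q{i} changes in states {states}")
# Q0 changes in states [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Q1 changes in states [1, 3, 5, 7]
# Q2 changes in states [3, 7]
# Q3 changes in states [7, 9]
```

Under these assumptions $Q_0$ changes in every cycle, while $Q_2$ and $Q_3$ each change in only two of the ten states. A derived clock for such a flip-flop only has to supply a triggering transition in those states to satisfy the covering relation (6).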
Example 1. Design of an 8421 BCD code up-counter
The next states and state behaviors of an 8421 BCD code up-counter are shown in Table 2, where behavior of
each flip-flop ($Q_i \to Q_i^{+}$) is denoted by $\widetilde{Q}_i$. From Table 2, the corresponding next state Karnaugh
maps and behavior Karnaugh maps may be obtained, as shown in Fig.2(a) and 2(b). In these maps an empty box represents