0% found this document useful (0 votes)
11 views4 pages

FPGA-Based Parallel LDPC Decoder Design

This document presents the design of a parallel low density parity check (LDPC) decoder architecture using a min-sum algorithm. The decoder is implemented on a Xilinx FPGA for high-speed communications. Check node and bit node processors are used to iteratively calculate messages based on the min-sum approximation of the sum-product algorithm. This allows for soft-decision decoding of regular LDPC codes with high error correction and reduced complexity compared to other decoding methods.
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
11 views4 pages

FPGA-Based Parallel LDPC Decoder Design

This document presents the design of a parallel low density parity check (LDPC) decoder architecture using a min-sum algorithm. The decoder is implemented on a Xilinx FPGA for high-speed communications. Check node and bit node processors are used to iteratively calculate messages based on the min-sum approximation of the sum-product algorithm. This allows for soft-decision decoding of regular LDPC codes with high error correction and reduced complexity compared to other decoding methods.
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

AbstractThis paper presents low complexity high speed decod-

er architecture for a high speed communication system. To get


these two advantages a parallel architecture using Low Density
Parity Check Code (LDPC) is proposed. Recently, LDPC codes
have attracted much attention due to their excellent error cor-
rection capability. The sparseness of LDPC provides a decoder
structure with less complexity. The parallel architecture is used
to increase the speed of the decoder. In this paper, implementa-
tion of the decoder is based on belief propagation algorithm
known as min sum algorithm. Based on the proposed architec-
ture a regular LDPC decoder is implemented on Xilinx Field
Programmable Gate Array (FPGA) vertex IV kit using Very
High speed Integrated Circuit Description Language (VHDL).

Keywords: Field-programmable gate arrays (FPGAs), low-density
parity-check (LDPC) codes, and parallel implementation.

I. INTRODUCTION
Low-Density-Parity-Check (LDPC) Codes are a form of
forward error correction codes, discovered by Gallager in
1962. They have high error correction capability and achieve
rates very close to Shannon limit. LDPC is a block code
whose parity-check matrix (H) contains very few number of
ones. The sparseness of H matrix guarantees low decoding
complexity and hardware flexibility. They are decoded itera-
tively using the graphical properties of the H matrix. LDPC
codes can be represented in two ways. One is matrix repre-
sentation and the other is graphical representation. Graphical
representation makes use of Tanner graph, which consists of
two classes of nodes known as bit nodes and check nodes.
Columns in the H matrix are represented by bit nodes and
rows are represented by the check nodes. Check node j is
connected to bit node i whenever element h
ji
in H is a one
[6]. An LDPC parity-check matrix is considered regular if
the H matrix has constant column degree and row degree.
But in irregular LDPC code, column and row degrees need
not be constant. It depends on the degree distribution polyno-
mials which give better error performance than regular
LDPC at the cost of hardware decoding complexity. Imple-
mentation of LDPC decoders can be done in two ways: Hard
-decision decoding and Soft-decision decoding. In hard deci-
sion decoding, even though circuit complexity is less, net
coding gain is also less. But in soft decision decoding, better
coding gain can be achieved than hard decision decoding
with a trade-off in circuit complexity. In this paper, to
achieve good error performance and less circuit complexity,
soft decision decoding for regular LDPC code is considered.
The decoder is based on a belief propagation algorithm. To
reduce hardware complexity, one of the belief propagation
algorithms known as min sum algorithm is explored in this
paper. Generally, LDPC decoder implementation falls into
three categories: serial, fully parallel and partially parallel
structures. Serial decoders are very slow as their whole de-
coding operation is done by only one processor. So this is
not suitable for a high speed communication system. Fully
parallel decoders directly instantiate the tanner graph of the
LDPC code to the hardware. In this, each individual variable
node or check node is implemented as a processor and all the
units are connected through an interconnection network re-
flecting the tanner graph connectivity. It is clear that such
fully parallel decoders can achieve very high decoding
throughput. Partially parallel LDPC decoder targets on ap-
propriate trade-off between hardware complexity and decod-
ing speed. Certain number of bit nodes or check nodes is
mapped to a single decoding unit. So decoding speed is less
than fully parallel structure but routing complexity can be
reduced by using this structure. In this paper, we present the
implementation of a regular, parallel LDPC decoder using
min sum algorithm. The remaining part of the paper is orga-
nized as follows. In Section II, a overview of the min sum
algorithm is given. Section III gives details of proposed ar-
chitecture of the LDPC decoder based on min sum algo-
rithm. Section IV provides the implementation results and
discussions of the LDPC decoder. Final concluding remarks
are given in Section V.

II. MIN SUM ALGORITHM
Among the iterative decoding algorithms of LDPC codes,
Sum Product algorithm provides good error correcting per-
formance. But this algorithm contains some hyperbolic func-
tions which are not easy to implement. Min-sum (MS) algo-
rithm approximates sum product algorithm with possibility
of easy hardware implementation.

A) Sum Product Algorithm (SPA):
This algorithm can be expressed in two forms. One in proba-
bility domain and the other in log domain. The implementa-
tion of SPA in log domain is easier compared to SPA in
probability domain as it has less complexity compared to
SPA in probability domain. So to implement an LDPC de-
coder with less hardware complexity SPA in log domain is
taken into account, here. Information from the encoder is
passed through the channel after BPSK modulation. This
will be corrupted by the noise in the channel and the re-
ceived symbol is,
y
i
= x
i
+ n
i
(1)
where x
i
= +1 or -1 with equal probability, and n
i
is the zero
mean Gaussian noise with variance
2
.

The algorithm is as follows:
The following notations are used in the algorithm: V
j
repre-
sents variable nodes connected to j
th
check node. V
j\i
denotes
variable nodes connected to j
th
check node except the i
th
vari-
able node. C
i
are the check nodes connected to i
th
variable
node and C
i\j
are the check nodes connected to variable node
FPGA Implementation of Regular Parallel LDPC Decoder
using Min Sum Algorithm

1
Sruthi. K,
2
Deepthy P.P
1,2
Department of Electronics and Communication Engineering,
National Institute of Technology Calicut, Calicut, Kerala-673 601, India
e-mail: sruthiudaykrishna@[Link]
1
, deepthi@[Link]
2

Volume 3, Issue ICRASE13, May 2013, ISSN Online: 2277-2677 201
International Journal of
Systems , Algorithms &
Applications
I II I
J JJ J
S SS S
A AA A A AA A


i except check node j. q
ij
(b) is the message passed from node
v
i
to node c
j
; b{0, 1}. Similarly, r
ji
(b) is the message passed
from node c
j
to node v
i
.

i) Initialization:
For each i{0,.n-1}, for each jC(i)



L(q
ij
), L(c
i
) are known as Log Likely Hood Ratios (LLRs)
defined as,






ii) Message Update:
1. Horizontal Step: For each j{0,.,n-k-1}, for each iV(j)











2. Vertical Step: For each i{0,., n-1}, for each iC(i),


3. Stopping Criterion Test: For each i{0,...n-1}, for each jC
(i),





Iterations are stopped if H
T
= 0 or if the number of iterations
reaches a fixed maximum value.

iii) Min Sum (MS) Algorithm:
Modifying the equation (6) in SPA gives min sum algorithm
[13].






III. PROPOSED ARCHITECTURE OF LDPC DECODER
In this section we present our proposed LDPC decoder archi-
tecture based on min sum algorithm mentioned in section II.
Here, each node of the tanner graph is considered as a proces-
sor and decoding is done in parallel. Figure.1 gives the pro-
posed LDPC architecture. In this, H matrix is stored in a
memory. Information from the channel, L(c
i
) are received by
the decoder as inputs and stored in another memory. These
information and H matrix are sent to the row scanning module
which helps us to build tanner graph.
























Figure.1. Block diagram of proposed LDPC Decoder.

It scans each row and find out the positions of 1s in that par-
ticular row. Then it accesses the L(c
i
) values from the input
memory according to the positions of 1s from the H matrix
memory. The controller output becomes high once it com-
pletes accessing the inputs from the input memory. Those L
(c
i
) values are the inputs of the check node processor. Here
equation (12) is computed and the L(r
ji
) values are given at the
output. All these values are given out parallel from the check
node processor and stored in a memory. The bit node proces-
sor takes these values from the memory and computes the
equations (8), (9), (10) and gives the code word as the output.
All processors and modules are controlled by the same clock
and that is the main advantage of this architecture.

A) Check Node Processor Architecture:
Check node processors are designed for the implementation of
equation (12) in the min sum algorithm. This is done by the
architecture shown in fig.2. L(c
i
) values are the output of the
row scanning module. These are given to the check node pro-
cessor and the sign and magnitude of these values are separat-
ed. According to the min sum algorithm, the L(r
ji
) values are
calculated by taking the product of signs and minimum mag-
nitude of the inputs of all variable nodes connected with j
th

check node except the i
th
variable node. Minimum module is
designed to get the minimum magnitude and XOR module is
designed to get the product of sign [1]. To get L(r
ji
) values, the
MSB bit of the output from the XOR module is checked. If it
is 1, the 2s complement of magnitude minimum values is
taken and if it is 0, magnitude itself is taken as the minimum
value. This gives the output of the check node processor. The
number of outputs from each check node processor depends
on the number of variable nodes connected with that check
node in the tanner graph.

i) XOR Module:
This is used for the implementation of product of sign term in
equation (12). XOR gates are used in this module [1]. The
detailed structure of implementation of sign terms for one
check node is as shown in figure.3. This gives the product of
all alpha values connected with check node except that partic-
ular variable node. This circuit is used in parallel for all check
nodes.
FPGA Implementation of Regular Parallel LDPC Decoder
using Min Sum Algorithm
Volume 3, Issue ICRASE13, May 2013, ISSN Online: 2277-2677 202
i i
i
i i
ij
ij
ij
pr(x =+1| y )
L(c ) =log (3)
pr(x = -1| y )
q (0)
L(q ) =log (4)
q (1)
ij ij ij ij
ji i V(j)\i i j i j
i V(j)\i
x
x
ji
ji
ji
= sign(m ), =| m | (5)
L(r ) =( ). ( ( ) ) (6)
where,
e +1 x
(x)=ln = -lntanh (7)
2 e - 1
r (0)
L(r ) =log
r (1)



| |
| |
| |
\
\

i
ij i j C \ j j i
L(q )= L(c )+ L(r ) (8)

i
i i j C ji
i
i
L(Q ) = L(c )+ L(r ) (9)
1 if L(Q ) 0
x = (10)
0 else

i j i V(j)\i i j
i V(j)\i
i V(j)\i i j
ji i V(j)\i i j i V(j)\i i j
( ( ) ) ( (min ))
= min (11)
L(r ) = .min (12)

2 ij i i ij
2
L(q ) L(c ) = y = m (2)

=
clk
input
memory
H matrix
memory
Row
scanning
module
con-
troll
er
Che
ck
node
pro-
cess
or
m
e
m
o
r
y
Bit
node
pro-
cesso
r
c
o
d
e
w
o
r
d
International Journal of
Systems , Algorithms &
Applications
I II I
J JJ J
S SS S
A AA A A AA A


Figure.2. Block diagram of check node processor.


Figure.3. circuit diagram of XOR module.

ii) Minimum Module:
























Figure.4. Block diagram of minimum module.
The figure.4 shows the internal architecture of the minimum
module in one check node processor. This circuit calculates
the magnitude (beta) minimum for j
th
check node with i
th
vari-
able node by taking magnitude values from all variable nodes
connected with j
th
check node except i
th
variable node [7]. All
beta minimum values are updated in parallel.

B) Variable Node Processor Architecture :
L(r
ji
) values from the check node processor are sent back to
the variable nodes connected with that particular check node.
At the variable node processor, these values are taken as the
inputs and stored in memories. The number of memory loca-
tions needed for a memory depends upon the number of varia-
ble nodes connected with each check node. To compute L(q
ij
)
messages from j
th
check node to i
th
variable node by using
equation (8), L(r
ji
) values from all the check nodes connected
with the i
th
variable node except the j
th
check node are added
together. These values are the input of check nodes for the
next iterations. Finally, the LLR values are added with the
final updated L(q
ij
) values. At the adder output, the stopping
criterion equation (10) is checked. If the MSB of the adder
output is 1, then L(Q
i
) is taken as 1. If it is 0, then L(Q
i
) is
taken as 0. These bits are concatenated to get the code word
output.
Figure.5. Block diagram of bit node processor.

IV. IMPLEMENTATION RESULT
A (2, 4) regular LDPC code with code rate of 1/2 is imple-
mented using min sum algorithm. Both variable node and
check node processors in this design are parallel. It takes 34
clock cycles to perform one iteration for a code length of 8.
The hardware utilization of the algorithm is shown in table1.
Here, after place and route, the report shows that maximum
frequency is 126.366MHz. This decoder is modelled in
VHDL, simulated and synthesized targeting on the vertex IV
FPGA Implementation of Regular Parallel LDPC Decoder
using Min Sum Algorithm
Volume 3, Issue ICRASE13, May 2013, ISSN Online: 2277-2677 203
2min
0
1
2
3
Com-
parator
Com-
parator
Com-
parator
Com-
parator
1min
0min
3min
International Journal of
Systems , Algorithms &
Applications
I II I
J JJ J
S SS S
A AA A A AA A


kit. This reconfigurable architecture can be easily scaled to
larger block lengths.

Table 1: Device Utilization Summary


V. CONCLUSION
A low complexity high speed LDPC decoder has been de-
signed and implemented. To reduce the hardware complexity
and to get high speed, regular parallel LDPC decoder is used
in this design. Here min sum algorithm is used for the imple-
mentation to reduce computations.

REFERENCES
[1] Ruwan N.S. Ratnayake, Erich [Link], Gu-Yeon Wei, A Bit
-node Centric Architecture for Low-Density Parity-Check De-
coders, in IEEE GLOBECOM 2007 Proceeding.
[2] Zhou Zhong,Yunzhou Li,Hanying Hu, Jing Wang, Modified
Min-Sum Decoding Algorithm for LDPC codes Based on Clas-
sified Correction, Third international conference on communi-
cation and network in china,2008
[3] Tong Zhang, Keshab K. Parhi,An FPGA Implementation of (3,
6)-Regular Low-Density Parity-Check Code Decoder, EURA-
SIP Journal on Applied Signal Processing 2003:6, 530542,
2003 Hindawi Publishing Corporation
[4] Thirupathi M, Ganesan S,VLSI implememtation of Very High
Speed LDPC Decoder, International Journal of Computer
Communication and Information System(IJCCIS)-
[Link]-[Link]-dec2010
[5] Damian [Link], Graciela Corral-Brions,Parallel architecture
for Decoding LDPC codes on High Speed Communication Sys-
tem,Proceedings of the Argentine school of micro nano elec-
tronics, Technology and Applications 2008
[6] Andrew [Link], Chris [Link],A690-mW 1-Gb/s 1024-
b,Rate-1/2 Low Density Parity Check code Decoder, IEEE
Journal of Solid State Circuit,Vol.37,No.3 March 2002
[7] Hang Jiang1, Chun Xu2, Qin Zhong3, and Guifeng Zhong3,A
Kind of Low Complexity LDPC Decoder, ISBN 978-952-5726
-07-7 (Print), 978-952-5726-08-4 (CD-ROM)Proceedings of the
Second Symposium International Computer Science and Com-
putational Technology(ISCSCT 09) Huangshan, P. R. China,
26-28,Dec. 2009, pp. 102-105
[8] Susmitha Remmanapudi, Balaji Bandaru, An FPGA implemen-
tation of Low Density Parity Check Codes Construction and
Decoding in devices, circuits and systems(ICDCS) internation-
al conference on 15-16 march 2012
[9] Dale [Link],LDPC Code Construction with Flexible Hard-
ware Implementation IEEE International Conference on Com-
munication, 2003,ICC,03
[10] [Link],Angus Wu,VLSI implementation for Low Density
Parity Check Decoder, International Conference on Environ-
mental and computer Science,ICECS 2001
[11] Edward Liao, Engling Yeo, Borivoje Nikolic,Low-Density
Parity-Check Code Construction for Harware Implementa-
tion ISBN: 0-7803-8533-0 In proceeding of IEEE Internation-
al Conference on Communications, Volume: 5, 2004
[12] Engling Yeo, Borivoje Nikolic,Venkat Anantha-
ram,Architectures and Implementations of Low-Density Parity-
Check Decoding Algorithms ISBN: 0-7803-7523-8 In pro-
ceeding of IEEE International Conference on Communications,
2002
[13] A. B. F. Zarkeshvari, On implementation of min-sum algo-
rithm for decoding low-density parity-check (LDPC) cpdes, in
IEEE Globecom conference, 2002.
Resources Total Used
Utilization
ratio
Slices 6144 2065 33%
Slice Flip Flops 12288 1012 8%
4 input LUTs 12288 4050 32%
Number of
GCLKs
32 1 3%
FPGA Implementation of Regular Parallel LDPC Decoder
using Min Sum Algorithm
Volume 3, Issue ICRASE13, May 2013, ISSN Online: 2277-2677 204
International Journal of
Systems , Algorithms &
Applications
I II I
J JJ J
S SS S
A AA A A AA A

You might also like