Understanding Data Flow Testing Techniques

Data Flow Testing is a white-box testing technique that emphasizes the definition, usage, and termination of variables in a program, utilizing control flow graphs to identify potential data anomalies. The technique aims to ensure that all defined variables are properly used and to detect data-centric bugs, making it more robust than traditional statement or branch testing. Key concepts include defining and using variables, identifying definition-use paths, and employing various testing strategies to cover all possible paths in the program.

Uploaded by feathh1

What is Data Flow Testing?

• Data Flow Testing is a white-box testing technique that focuses on
how data (variables) is defined, used, and killed in a program,
rather than just on which statements or branches are executed.
• Data flow testing uses the control flow graph to explore the
unreasonable things that can happen to data.
• Consideration of data-flow anomalies leads to test path selection
strategies that fill the gaps between complete path testing and
branch and statement testing.

Data flow testing is the name given to a family of test strategies based
on selecting paths through the program’s control flow in order to
explore sequences of events related to the status of data objects.

For example, pick enough paths to ensure that every data object has been
initialized prior to use, or that all defined objects have been used for something.
What is Data Flow Testing?
The main points of concern are:
✓ Statements where variables receive values (definition).
✓ Statements where these values are used (referenced).

1) Data flow testing focuses on variable definition and variable
usage.
2) Variables are defined and used (referenced) throughout the
program.
3) The technique concentrates on how a variable is defined and
used at different places in the program.
Data Flow Testing
Define / Reference Anomalies:
i. A variable that is defined but never used (referenced).
ii. A variable that is used but never defined.
iii. A variable that is defined twice before it is used.
iv. A variable that is used before its first definition.

We may
➢ define a variable,
➢ use a variable, and
➢ redefine a variable.

Define / reference anomalies may be identified by static
analysis of the program, i.e. by analyzing the program without
executing it.
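
As a rough illustration of how such a static check might work, the sketch below scans the define/use events of one variable along one path and flags the anomalies listed above. The function name, the flag names, and the "d"/"u" event-string encoding are assumptions made for this example; they are not part of the slides or of any particular tool.

```c
#include <stddef.h>

/* Bit flags for the define/reference anomalies (names are illustrative). */
enum {
    ANOMALY_USE_BEFORE_DEF = 1, /* a use while the variable is undefined   */
    ANOMALY_DEF_DEF        = 2, /* a definition while an earlier one is
                                   still unused                            */
    ANOMALY_DEF_UNUSED     = 4  /* the last definition is never used       */
};

/*
 * Scan the events of ONE variable along ONE path, e.g. "ddu".
 * 'd' = define, 'u' = use; any other character is ignored.
 * Returns the bitwise OR of the anomaly flags that were found.
 */
int find_anomalies(const char *events)
{
    int defined = 0;      /* has the variable been defined yet?  */
    int pending_def = 0;  /* is there a definition not yet used? */
    int found = 0;

    for (size_t i = 0; events[i] != '\0'; i++) {
        if (events[i] == 'd') {
            if (pending_def)
                found |= ANOMALY_DEF_DEF;
            defined = 1;
            pending_def = 1;
        } else if (events[i] == 'u') {
            if (!defined)
                found |= ANOMALY_USE_BEFORE_DEF;
            pending_def = 0;
        }
    }
    if (pending_def)
        found |= ANOMALY_DEF_UNUSED;
    return found;
}
```

For instance, find_anomalies("du") reports nothing, while find_anomalies("ddu") reports a double definition. A real analyzer would derive the event sequences from the control flow graph rather than from hand-written strings.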
Data Flow Testing – Key Idea

• Track data object states:
– d (define): when a variable is assigned or initialized.
– u (use): when a variable is read/used (in a computation or
predicate).
– k (kill): when a variable is deallocated, goes out of scope, or is
overwritten.
• Construct definition-use (DU) paths in the control flow
graph.
• Select test cases that ensure every def-use pair is executed.
Data Flow Testing – Why Important

• Finds data-centric bugs such as:
– Using a variable before initialization.
– Defining a variable but never using it.
– Redefining a variable before its use.
• Stronger than statement/branch testing because it tests the
correctness of data usage.
Definitions: Data Flow Testing
Definition (DEF)
• A variable is defined when it is initialized, declared, or
assigned a value.
Example
int x = 0; // definition of x
y = 5; // definition of y
➢ Definition also covers events such as opening a file, allocating
an object, or pushing onto a stack.

❑ A node of a program graph is a defining node for a variable v
if and only if the value of the variable v is defined in the
statement corresponding to that node.
❑ It is represented as DEF (v, n), where v is the variable and n is
the node corresponding to the statement in which v is defined.
Definitions: Data Flow Testing
Usage (USE): A variable is used when its value is accessed.
Two types of uses:
C-use (Computation use): the variable is used in a calculation or
assignment. For example: z = x + y; // x and y are C-uses
P-use (Predicate use): the variable is used in a condition or loop
control. For example: if (x > 0) // x is a P-use

✓ A node of a program graph is a usage node for a variable ‘v’ if and only if
the value of the variable is used in the statement corresponding to that
node.
✓ It is represented as USE (v, n), where ‘v’ is the variable and ‘n’ is the node
corresponding to the statement in which ‘v’ is used.
✓ A usage node USE (v, n) is a predicate use node (denoted as P-use) if and
only if the statement corresponding to node ‘n’ is a predicate statement;
otherwise USE (v, n) is a computation use node (denoted as C-use).
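
To make the DEF / C-use / P-use distinction concrete, here is a small annotated C function. The function itself is hypothetical and exists only to carry the labels in its comments.

```c
/* Hypothetical function, used only to label definitions and uses. */
int scaled_sum(int x, int y)   /* x and y are defined on entry        */
{
    int z = x + y;             /* DEF of z; C-use of x and y          */

    if (z > 10) {              /* P-use of z (predicate)              */
        z = z * 2;             /* C-use of z, then a new DEF of z     */
    }
    return z;                  /* C-use of z (value handed to caller) */
}
```

Calling scaled_sum(3, 4) skips the branch and returns 7, while scaled_sum(6, 7) takes it and returns 26, so the two calls together exercise the P-use of z in both directions.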
Definitions: Data Flow Testing

Definition-Use (DU) Path
• A path in the program’s control flow graph that starts with
a definition of a variable and ends with a use of that
variable.
• Redefinitions of the variable may occur in between; a du-path
without any is a definition-clear path.

Definition-Clear (DC) Path
• A DU-path with no other definition of that variable along
the way.
• A definition-clear path (denoted as dc-path) for a
variable ‘v’ is a definition-use path with initial and final
nodes DEF (v, m) and USE (v, n) such that no other node
in the path is a defining node of variable ‘v’.
Definitions: Data Flow Testing

Kill (K)
• A variable is killed when its value is destroyed, deallocated,
goes out of scope, or becomes undefined.
Examples:
• A local variable after its function ends.
• Freeing a pointer in C: free(p) kills the data that p points to.
• Reassignment can be considered kill + redefine.
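
The three states d, u and k can be seen side by side in a short C function. This is only a sketch: the helper name sum_to and the d/u/k comment labels are assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: sums 1..n through a heap buffer. */
long sum_to(int n)
{
    long total = 0;                 /* d: total                        */
    int *buf;

    if (n <= 0)                     /* u: P-use of n                   */
        return 0;
    buf = malloc((size_t)n * sizeof *buf);   /* d: buf                 */
    if (buf == NULL)                /* u: P-use of buf                 */
        return -1;
    for (int i = 0; i < n; i++) {
        buf[i] = i + 1;             /* d: buf[i]                       */
        total += buf[i];            /* u: buf[i]; u then d: total      */
    }
    free(buf);                      /* k: the block behind buf         */
    return total;                   /* u: total; locals are killed
                                       when the function returns       */
}
```

Any read of buf[i] after the free call would be a use-after-kill, which is exactly the kind of sequence data flow testing looks for.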
Definitions: Data Flow Testing

• The du-paths and dc-paths describe the flow of data across
program statements, from statements where values are
defined to statements where the values are used.
• A du-path for a variable ‘v’ may have many redefinitions of
variable ‘v’ between the initial node (DEF (v, m)) and the final
node (USE (v, n)).
• A dc-path for a variable ‘v’ will not have any definition of
variable ‘v’ between the initial node (DEF (v, m)) and the final
node (USE (v, n)).

➢ The du-paths that are not definition-clear paths are potentially
troublesome paths.
➢ They should be identified and tested with the highest priority.
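
The dc-path condition can be checked mechanically once a path and the defining nodes of a variable are known. The following is a minimal sketch under the definition given above (only interior nodes are examined); the function name and the array-based path representation are assumptions of this example.

```c
#include <stddef.h>

/*
 * Return 1 if no node strictly between the first and the last node
 * of `path` is a defining node of the variable, and 0 otherwise.
 * `defs` lists the defining nodes of that one variable.
 */
int is_definition_clear(const int *path, size_t path_len,
                        const int *defs, size_t ndefs)
{
    /* Interior nodes are path[1] .. path[path_len - 2]. */
    for (size_t i = 1; i + 1 < path_len; i++) {
        for (size_t j = 0; j < ndefs; j++) {
            if (path[i] == defs[j])
                return 0;   /* redefinition found inside the path */
        }
    }
    return 1;
}
```

With hypothetical defining nodes {1, 3}, the path <1, 2, 4> is definition clear, whereas <1, 2, 3, 4> is not, because node 3 redefines the variable before the use at node 4.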
Identification of du and dc Paths
The steps for identifying du and dc paths are:
1) Draw the program graph of the program.
2) Find all variables of the program and prepare a table of the
define / use status of all variables (columns: Variable,
Defined at Node, Used at Node).
3) Generate all du-paths from the define / use variable table
(columns: Variable, Path (beginning, end) Nodes, Definition Clear?).
4) Identify those du-paths which are not dc-paths.
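
Step 3 can be sketched as pairing every defining node of a variable with every usage node. The struct and function names below are hypothetical, and the result is only a list of candidate (beginning, end) pairs: each one still has to be checked for feasibility in the program graph and for definition clearness.

```c
#include <stddef.h>

/* One candidate du-path, identified by its beginning and end nodes. */
struct du_pair {
    int def_node;
    int use_node;
};

/*
 * Write every (definition, use) combination for one variable into
 * `out` and return how many pairs were written (never more than `cap`).
 */
size_t list_du_pairs(const int *defs, size_t ndefs,
                     const int *uses, size_t nuses,
                     struct du_pair *out, size_t cap)
{
    size_t n = 0;

    for (size_t d = 0; d < ndefs; d++) {
        for (size_t u = 0; u < nuses; u++) {
            if (n == cap)
                return n;           /* output buffer is full */
            out[n].def_node = defs[d];
            out[n].use_node = uses[u];
            n++;
        }
    }
    return n;
}
```

A variable with two defining nodes and three usage nodes therefore yields six candidate pairs.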


Testing Strategies Using du-Paths
i) Test all du-paths
• All du-paths generated for all variables are tested. This is
the strongest data flow testing strategy, covering all
possible du-paths.

ii) Test all uses
• Find at least one path from every definition of every
variable to every use of that variable which can be
reached by that definition.
• For every use of a variable, there is a path from the
definition of that variable to the use of that variable.

iii) Test all definitions
• Find paths from every definition of every variable to at
least one use of that variable.
Testing Strategies Using du-Paths
✓ The first strategy requires that each definition reaches all
possible uses through all possible du-paths.
✓ The second strategy requires that each definition reaches
all possible uses.
✓ The third requires that each definition reaches at least one
use.

We may choose any of these strategies for testing.
✓ As we go from ‘test all du-paths’ (no. (i)) to ‘test all
definitions’ (no. (iii)), the number of paths is reduced.
✓ It is best to test all du-paths (no. (i)) and give priority to
those du-paths which are not definition-clear paths.
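
The difference between ‘test all uses’ and ‘test all definitions’ can be expressed as two checks over a coverage matrix for one variable. This is a sketch only: the flattened row-major layout (index d * nuses + u) and both function names are assumptions of this example, and ‘test all du-paths’ is not modelled because it needs path-level rather than pair-level information.

```c
#include <stddef.h>

/*
 * reachable[d * nuses + u] is nonzero if use u can be reached from
 * definition d; covered[d * nuses + u] is nonzero if some test case
 * exercises a path from definition d to use u.
 */

/* Test all definitions: each definition reaches at least one use. */
int satisfies_all_defs(const int *reachable, const int *covered,
                       size_t ndefs, size_t nuses)
{
    for (size_t d = 0; d < ndefs; d++) {
        int has_use = 0, hit = 0;
        for (size_t u = 0; u < nuses; u++) {
            if (reachable[d * nuses + u]) {
                has_use = 1;
                if (covered[d * nuses + u])
                    hit = 1;
            }
        }
        if (has_use && !hit)
            return 0;
    }
    return 1;
}

/* Test all uses: every reachable (definition, use) pair is covered. */
int satisfies_all_uses(const int *reachable, const int *covered,
                       size_t ndefs, size_t nuses)
{
    for (size_t i = 0; i < ndefs * nuses; i++) {
        if (reachable[i] && !covered[i])
            return 0;
    }
    return 1;
}
```

A test suite that covers only one use per definition passes the first check and fails the second, which matches the ordering of the strategies from weaker to stronger.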
Exercise: Identify du-path for all variables of the Program and
also Generate Test Cases
1. void main()
2. {
3. float A,B,C;
4. clrscr();
5. printf("Enter number 1:\n");
6. scanf("%f", &A);
7. printf("Enter number 2:\n");
8. scanf("%f", &B);
9. printf("Enter number 3:\n");
10. scanf("%f", &C);
/*Check for greatest of three numbers*/
11. if(A>B) {
12. if(A>C) {
13. printf("The largest number is: %f\n",A);
14. }
15. else {
16. printf("The largest number is: %f\n",C);
17. }
18. }
19. else {
20. if(C>B) {
21. printf("The largest number is: %f\n",C);
22. }
23. else {
24. printf("The largest number is: %f\n",B);
25. }
26. }
27. getch();
28. }
Figure: Program to find the largest among three numbers and its program flow graph
Exercise: Identify du-path for all variables of the Program and
also Generate Test Cases
Step 1: Define / use nodes for all three variables ‘A’, ‘B’ and ‘C’
Step 2: du-paths with beginning node and end node
Step 3: Test cases
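
The tables for these steps are not reproduced in the text, so the following is only one possible reading of the listing, not the original solution. Going by the statement numbers, A is defined at 6 and used at 11, 12 and 13; B is defined at 8 and used at 11, 20 and 24; C is defined at 10 and used at 12, 16, 20 and 21. The hypothetical function below mirrors the decision logic of statements 11 to 26 so that one test case per printing statement can be written down.

```c
/*
 * Mirrors statements 11 to 26 of the exercise program.
 * Returns 'A', 'B' or 'C' to name the value the program would print.
 */
char largest_of(float A, float B, float C)
{
    if (A > B) {                /* statement 11: P-use of A and B */
        if (A > C)              /* statement 12: P-use of A and C */
            return 'A';         /* statement 13                   */
        else
            return 'C';         /* statement 16                   */
    } else {
        if (C > B)              /* statement 20: P-use of C and B */
            return 'C';         /* statement 21                   */
        else
            return 'B';         /* statement 24                   */
    }
}
```

Four inputs are enough to reach every use: (9, 8, 7) prints A, (9, 8, 10) prints C through statement 16, (7, 8, 9) prints C through statement 21, and (7, 9, 8) prints B.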
Data Flow Testing
Example
1 Program Commission (INPUT, OUTPUT)
2 Dim locks, stocks, barrels As Integer
3 Dim lockPrice, stockPrice, barrelPrice As Real
4 Dim totalLocks, totalStocks, totalBarrels As Integer
5 Dim lockSales, stockSales, barrelSales As Real
6 Dim sales, commission As Real
7 lockPrice = 45.0
8 stockPrice = 30.0
9 barrelPrice = 25.0
10 totalLocks = 0
11 totalStocks = 0
12 totalBarrels = 0
13 Input (locks)
14 While NOT (locks = -1) /* loop condition uses -1 to indicate end of data */
15 Input (stocks, barrels)
16 totalLocks = totalLocks + locks
17 totalStocks = totalStocks + stocks
18 totalBarrels = totalBarrels + barrels
19 Input (locks)
20 End While
21 Output ("Locks sold:", totalLocks)
22 Output ("Stocks sold:", totalStocks)
23 Output ("Barrels sold:", totalBarrels)
24 lockSales = lockPrice * totalLocks
25 stockSales = stockPrice * totalStocks
26 barrelSales = barrelPrice * totalBarrels
27 sales = lockSales + stockSales + barrelSales
28 Output ("Total sales:", sales)
29 If (sales > 1800.0)
30 Then
31 commission = 0.10 * 1000.0
32 commission = commission + 0.15 * 800.0
33 commission = commission + 0.20 * (sales - 1800.0)
34 Else If (sales > 1000.0)
35 Then
36 commission = 0.10 * 1000.0
37 commission = commission + 0.15 * (sales - 1000.0)
38 Else commission = 0.10 * sales
39 Endif
40 Endif
41 Output ("Commission is", commission)
42 End Commission
Program Graph
(Figure: program graph of the Commission program, nodes 7 to 42; the drawing is not reproduced here.)
DD (Decision to Decision) Path Graph
(Figure: DD-path graph with nodes A to L; the drawing is not reproduced here.)
DD-Path   Nodes
A         7, 8, 9, 10, 11, 12, 13
B         14
C         15, 16, 17, 18, 19
D         20, 21, 22, 23, 24, 25, 26, 27, 28
E         29
F         30, 31, 32, 33
G         34
H         35, 36, 37
I         38
J         39
K         40
L         41, 42
Define / Use Nodes for variables
Variable       Defined at Node           Used at Node
lockPrice      7                         24
stockPrice     8                         25
barrelPrice    9                         26
totalLocks     10, 16                    16, 21, 24
totalStocks    11, 17                    17, 22, 25
totalBarrels   12, 18                    18, 23, 26
locks          13, 19                    14, 16
stocks         15                        17
barrels        15                        18
lockSales      24                        27
stockSales     25                        27
barrelSales    26                        27
sales          27                        28, 29, 33, 34, 37, 38
commission     31, 32, 33, 36, 37, 38    32, 33, 37, 41
Selected Define / Use Paths
Variable       Path (beginning, end) Nodes   Definition Clear?
lockPrice      7, 24                         Yes
stockPrice     8, 25                         Yes
barrelPrice    9, 26                         Yes
totalStocks    11, 17                        Yes
totalStocks    11, 22                        No
totalStocks    11, 25                        No
totalStocks    17, 17                        Yes
totalStocks    17, 22                        No
totalStocks    17, 25                        No
locks          13, 14                        Yes
locks          19, 14                        Yes
locks          13, 16                        Yes
locks          19, 16                        Yes
sales          27, 28                        Yes
sales          27, 29                        Yes
sales          27, 33                        Yes
sales          27, 34                        Yes
sales          27, 37                        Yes
sales          27, 38                        Yes
Derive
• du-path for stocks
• du-paths for locks
• du-paths for totalLocks
• du-paths for sales
• du-paths for commission
du-path for stocks
Path <15, 17>
du-Paths for locks
DEF (locks, 13)
DEF (locks, 19)
USE (locks, 14)
USE (locks, 16)
These yield four du-paths:
P1 = <13, 14>
P2 = <13, 14, 15, 16>
P3 = <19, 20, 14>
P4 = <19, 20, 14, 15, 16>
du-Paths for locks
If we extend paths P1 and P3 to include node 21, then
P1' = <13, 14, 21>
P3' = <19, 20, 14, 21>
P1', P2, P3' and P4 form a very complete set of
test cases for the while loop:
- bypass the loop
- begin the loop
- repeat the loop
- exit the loop
All these du-paths are definition clear.
du-paths that are not definition clear are
potential trouble spots.
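
The four loop situations (bypass, begin, repeat, exit) can be tried out on a small C stand-in for statements 10 to 20. This is a sketch under assumptions: input is modelled as an array instead of Input (locks), running out of input is treated like the -1 sentinel, and the stocks and barrels inputs are left out.

```c
#include <stddef.h>

/* Stand-in for statements 10 to 20, restricted to locks. */
int total_locks(const int *inputs, size_t n)
{
    size_t next = 0;
    int total = 0;                                   /* statement 10 */
    int locks = (next < n) ? inputs[next++] : -1;    /* statement 13 */

    while (locks != -1) {                            /* statement 14 */
        total = total + locks;                       /* statement 16 */
        locks = (next < n) ? inputs[next++] : -1;    /* statement 19 */
    }
    return total;                                    /* used at 21   */
}
```

The input {-1} bypasses the loop, {5, -1} begins and then exits it, and {5, 7, -1} also repeats it, so three inputs cover all four situations.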
du-Paths for totalLocks
DEF (totalLocks, 10)
DEF (totalLocks, 16)
USE (totalLocks, 16)
USE (totalLocks, 21)
USE (totalLocks, 24)
We might expect six du-paths (two definitions times three uses):
P5 = <10, 11, 12, 13, 14, 15, 16> (definition clear)
P6 = <10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21>
(not definition clear, because it passes through the definition at node 16)
Path P6 ignores the possible repetition of the while
loop. We could highlight this by noting that the subpath
<16, 17, 18, 19, 20, 14, 15> might be traversed
several times.
du-Paths for totalLocks
P7 = <10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
22, 23, 24>
P7 = <P6, 22, 23, 24> (not definition clear)
The path <16, 16> is degenerate and may be ignored.
P8 = <16, 17, 18, 19, 20, 21>
P9 = <16, 17, 18, 19, 20, 21, 22, 23, 24>
P8 and P9 are definition clear, and both have the loop iteration
problem.
du-Paths for sales
DEF (sales, 27)
USE (sales, 28)
USE (sales, 29)
USE (sales, 33)
USE (sales, 34)
USE (sales, 37)
USE (sales, 38)
Only one defining node is used for sales; therefore, all
the du-paths with respect to sales must be definition
clear.
The du-paths for sales are interesting because they
illustrate predicate and computation uses.
du-Paths for sales
P10 = <27, 28>
P11 = <27, 28, 29>
P12 = <27, 28, 29, 30, 31, 32, 33>
There are two choices for du-paths that begin with path P11:
• The static choice is the path <27, 28, 29, 30, 31, 32, 33,
34>.
• The dynamic choice is the path <27, 28, 29, 34>.
So the remaining du-paths for sales are
P13 = <27, 28, 29, 34>
P14 = <27, 28, 29, 34, 35, 36, 37>
P15 = <27, 28, 29, 38>
du-Paths for commission
Variable Path (beginning, end) Nodes Feasible? Definition Clear?
Commission 31, 32 Yes Yes
Commission 31, 33 Yes No
Commission 31, 37 No N/A
Commission 31, 41 Yes No
Commission 32, 32 Yes Yes
Commission 32, 33 Yes Yes
Commission 32, 37 No N/A
Commission 32, 41 Yes No
Commission 33, 32 No N/A
Commission 33, 33 Yes Yes
Commission 33, 37 No N/A
Commission 33, 41 Yes Yes
Commission 36, 32 No N/A
Commission 36, 33 No N/A
Commission 36, 37 Yes Yes
Commission 36, 41 Yes No
Commission 37, 32 No N/A
Commission 37, 33 No N/A
Commission 37, 37 Yes Yes
Commission 37, 41 Yes Yes
Commission 38, 32 No N/A
Commission 38, 33 No N/A
Commission 38, 37 No N/A
Commission 38, 41 Yes Yes
du-paths for commission
In statements 29 through 41, the calculation of
commission is controlled by ranges of the variable
sales. Statements 31 to 33 build up the value of
commission by using the memory location to hold
intermediate values. This is a common
programming practice.
Therefore, the du-paths of interest begin with the three real
defining nodes:
DEF (commission, 33)
DEF (commission, 37)
DEF (commission, 38)
Only one usage node is used:
USE (commission, 41)
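
One test value of sales per "real" defining node is enough to drive each of these du-paths to the use at node 41. The C function below is a direct transcription of statements 29 to 40, offered as a sketch: the function name is hypothetical, and the three sales values mentioned afterwards are chosen only for illustration.

```c
/* Transcription of statements 29 to 40 for a given sales value. */
double commission_for(double sales)
{
    double commission;

    if (sales > 1800.0) {
        commission = 0.10 * 1000.0;                          /* 31 */
        commission = commission + 0.15 * 800.0;              /* 32 */
        commission = commission + 0.20 * (sales - 1800.0);   /* 33 */
    } else if (sales > 1000.0) {
        commission = 0.10 * 1000.0;                          /* 36 */
        commission = commission + 0.15 * (sales - 1000.0);   /* 37 */
    } else {
        commission = 0.10 * sales;                           /* 38 */
    }
    return commission;                                       /* 41 */
}
```

With sales of 2000 the value reaching node 41 comes from node 33 (100 + 120 + 40 = 260), with 1500 it comes from node 37 (100 + 75 = 175), and with 500 it comes from node 38 (50).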
Conclusion - Data Flow Testing

2) Improves Reliability of Programs
• Critical in domains where data correctness matters more
than path correctness:
– Finance (bank transactions).
– Medical systems (patient data).
– Aerospace/embedded systems (sensor values).

3) Provides Stronger Test Coverage
• Goes beyond statement/branch coverage → ensures all
definition-use pairs are tested.
• More likely to reveal hidden bugs that appear only with
specific variable states.
Conclusion - Data Flow Testing

4) Helps in Early Error Detection
• Detects anomalies during development/testing before they
become runtime failures.
• Reduces debugging effort and saves cost in the software
lifecycle.

5) Complements Other Testing Strategies
• Bridges the gap between path testing (too exhaustive) and
statement testing (too weak).
• Provides a balanced, systematic approach to ensure data is
handled correctly.
Assignment
Read any one of the following papers, rewrite it in your own
words, and submit the paper along with a presentation by
September 30, 2025 on eLMS.
1. Rapps, S. and Weyuker, E. J., "Selecting software test data using
data flow information," IEEE Transactions on Software
Engineering, vol. SE-11, no. 4, pp. 367-375, April 1985.
2. Clarke, L. A., et al., "A formal evaluation of data flow path
selection criteria," IEEE Transactions on Software
Engineering, vol. SE-15, no. 11, pp. 1318-1332, Nov. 1989.
3. Gupta, N. and Gupta, R., "Data Flow Testing," The University of
Arizona, Tucson, Arizona.
