Java Regular Expressions Overview

Regular expressions are patterns used to match character combinations in strings. They are useful for tasks like validation, search/replace, parsing text. The document discusses regular expression concepts like Pattern and Matcher classes, character classes, quantifiers, and split() method. Pattern objects compile an expression, while Matcher objects use the pattern to search strings and return details of matches. Character classes define groups of characters to match, and quantifiers specify match counts. The split() method divides a string using the regular expression pattern.

Uploaded by

Lalit
  • Regular Expressions Introduction
  • Pattern and Matcher Classes
  • Character Classes
  • Quantifiers
  • Pattern Splitting
  • String Tokenizer
  • Application Examples

Regular Expression

• If we want to represent a group of strings according to a particular pattern, then we should use a regular expression.

Example:

We can write a regular expression to represent all valid mobile numbers.

We can write a regular expression to represent all mail IDs.

• The main application areas of regular expressions are:

1. To develop validation frameworks.
2. To develop pattern matching applications (Ctrl + F in Windows and the grep command in Unix).
3. To develop translators like assemblers, compilers, interpreters, etc.
4. To develop digital circuits.
5. To develop communication protocols like TCP/IP, UDP, etc.

package RegularExcpression;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        int count = 0;

        Pattern p = Pattern.compile("ab");
        Matcher m = p.matcher("abbabbba"); // matcher() is a method of the Pattern class

        while (m.find()) {
            count++;
            System.out.println(m.start()); // start index of the match
        }
        System.out.println("The total number of occurrences is: " + count);
    }
}

Output:

0
3
The total number of occurrences is: 2

package RegularExcpression;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        int count = 0;

        Pattern p = Pattern.compile("ab");
        Matcher m = p.matcher("abbabbba"); // matcher() is a method of the Pattern class

        while (m.find()) {
            count++;
            // start index, end index (exclusive) and the matched text ("ab")
            System.out.println(m.start() + " " + m.end() + " " + m.group());
        }

        System.out.println("The total number of occurrences is: " + count);
    }
}

Output:

0 2 ab
3 5 ab
The total number of occurrences is: 2

Pattern:
• A Pattern object is the compiled version of a regular expression, i.e. it is the Java object equivalent of the pattern.
• We can create a Pattern object by using the compile() method of the Pattern class.

public static Pattern compile(String regex);

Pattern p = Pattern.compile("ab");

Matcher:
• We can use a Matcher object to check for the given pattern in the target String.
• We can create a Matcher object by using the matcher() method of the Pattern class.

public Matcher matcher(CharSequence target);

Matcher m = p.matcher("abbabbba");

Important methods of Matcher class:

• boolean find(); --- attempts to find the next match and returns true if one is available.
• int start(); --- returns the start index of the match.
• int end(); --- returns the index just after the last matched character (end index + 1).
• String group(); --- returns the matched text.

Note: The Pattern and Matcher classes are present in the java.util.regex package and were introduced in Java 1.4.
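Because end() returns the end index + 1, the pair start()/end() can be passed directly to String.substring(). The following is a minimal sketch of that relationship; the class name MatcherIndexSketch and the helper slices() are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherIndexSketch {

    // Cuts each match out of the target using start() and end().
    static List<String> slices(String regex, String target) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(target);
        while (m.find()) {
            // end() is exclusive, so substring(start, end) equals group().
            out.add(target.substring(m.start(), m.end()));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(slices("ab", "abbabbba")); // prints [ab, ab]
    }
}
```

Each slice is the same text that group() would return for that match.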
Character classes:
• [abc]          either 'a' or 'b' or 'c'
• [^abc]         any character except 'a', 'b' and 'c'
• [a-z]          any lower case alphabet symbol from a to z
• [A-Z]          any upper case alphabet symbol from A to Z
• [a-zA-Z]       any alphabet symbol
• [0-9]          any digit from 0 to 9
• [0-9a-zA-Z]    any alphanumeric symbol
• [^0-9a-zA-Z]   any character except an alphanumeric symbol

package RegularExcpression;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        int count = 0;

        Pattern p = Pattern.compile("[abc]");
        Matcher m = p.matcher("a3b#k@9z"); // matcher() is a method of the Pattern class

        while (m.find()) {
            count++;
            // start index and the matched character
            System.out.println(m.start() + " " + m.group());
        }

        System.out.println("The total number of occurrences is: " + count);
    }
}

Output:

0 a
2 b
The total number of occurrences is: 2

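Matches of each character class in the target "a3b#k@9z" (start index, matched character):

[abc]          0 a, 2 b
[^abc]         1 3, 3 #, 4 k, 5 @, 6 9, 7 z
[a-z]          0 a, 2 b, 4 k, 7 z
[0-9]          1 3, 6 9
[a-zA-Z0-9]    0 a, 1 3, 2 b, 4 k, 6 9, 7 z
[^a-zA-Z0-9]   3 #, 5 @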
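The results for all six character classes can be reproduced with a single loop instead of editing the program once per pattern. This is a minimal sketch under that assumption; the class name CharClassTableSketch and the helper matches() are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CharClassTableSketch {

    // Collects one "index character" entry per match of regex in target.
    static List<String> matches(String regex, String target) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(target);
        while (m.find()) {
            found.add(m.start() + " " + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String target = "a3b#k@9z";
        String[] classes = {
            "[abc]", "[^abc]", "[a-z]", "[0-9]", "[a-zA-Z0-9]", "[^a-zA-Z0-9]"
        };
        // One output line per character class.
        for (String c : classes) {
            System.out.println(c + " -> " + matches(c, target));
        }
    }
}
```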

Predefined character classes:

• \s    Any whitespace character (space, tab, newline, etc.)
• \S    Any character except whitespace
• \d    Any digit from 0 to 9, i.e. [0-9]
• \D    Any character except a digit
• \w    Any word character, i.e. [a-zA-Z0-9_]
• \W    Any character except a word character (special character)
• .     Any character (line terminators are excluded by default)
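Some edge cases of the predefined classes are easy to get wrong: in Java, \w also accepts the underscore, \s accepts tabs as well as spaces, and . does not match a newline by default. The following minimal sketch checks these cases with Pattern.matches(); the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class PredefinedClassSketch {

    // True if the single character c is matched by \w.
    static boolean isWordChar(char c) {
        return Pattern.matches("\\w", String.valueOf(c));
    }

    // True if the single character c is matched by \s.
    static boolean isWhitespace(char c) {
        return Pattern.matches("\\s", String.valueOf(c));
    }

    // True if the single character c is matched by "." with default flags.
    static boolean isMatchedByDot(char c) {
        return Pattern.matches(".", String.valueOf(c));
    }

    public static void main(String[] args) {
        System.out.println(isWordChar('_'));      // underscore is a word character
        System.out.println(isWordChar('#'));      // '#' is not
        System.out.println(isWhitespace('\t'));   // tab counts as whitespace
        System.out.println(isMatchedByDot('\n')); // newline is excluded by default
    }
}
```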

package RegularExcpression;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        int count = 0;
        Pattern p = Pattern.compile("\\s");
        Matcher m = p.matcher("a7b #k@9z"); // matcher() is a method of the Pattern class

        while (m.find()) {
            count++;
            // start index and the matched character (here a space)
            System.out.println(m.start() + " " + m.group());
        }

        System.out.println("The total number of occurrences is: " + count);
    }
}

Output:

3
The total number of occurrences is: 1

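Matches of each predefined character class in the target "a7b k@9z" (start index, matched character). Note that this target has no '#', unlike the one in the program above:

\\s    3 (space)
\\S    0 a, 1 7, 2 b, 4 k, 5 @, 6 9, 7 z
\\d    1 7, 6 9
\\D    0 a, 2 b, 3 (space), 4 k, 5 @, 7 z
\\w    0 a, 1 7, 2 b, 4 k, 6 9, 7 z
\\W    3 (space), 5 @
.      0 a, 1 7, 2 b, 3 (space), 4 k, 5 @, 6 9, 7 z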

Quantifiers:
We can use quantifiers to specify the number of occurrences to match.

• a     Exactly one 'a'.
• a+    At least one 'a'.
• a*    Any number of a's, including zero.
• a?    At most one 'a'.

package RegularExcpression;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        int count = 0;

        Pattern p = Pattern.compile("a");
        Matcher m = p.matcher("abaabaab"); // matcher() is a method of the Pattern class

        while (m.find()) {
            count++;
            // start index and the matched text
            System.out.println(m.start() + " " + m.group());
        }

        System.out.println("The total number of occurrences is: " + count);
    }
}

Output:

0 a
2 a
3 a
5 a
6 a
The total number of occurrences is: 5

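Matches of each quantifier pattern in the target "abaabaaab" (start index, matched text; "(empty)" marks a zero-length match). Note that this target has one more 'a' than the one in the program above:

a     0 a, 2 a, 3 a, 5 a, 6 a, 7 a
a+    0 a, 2 aa, 5 aaa
a*    0 a, 1 (empty), 2 aa, 4 (empty), 5 aaa, 8 (empty), 9 (empty)
a?    0 a, 1 (empty), 2 a, 3 a, 4 (empty), 5 a, 6 a, 7 a, 8 (empty), 9 (empty)
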
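The zero-length matches produced by a* and a? are the least obvious part of these results. The sketch below makes them visible by quoting each matched text, assuming the nine-character target "abaabaaab" that the indices in the results imply; the class name QuantifierSketch and the helper matches() are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierSketch {

    // Collects entries of the form index:'text'; an empty match shows as index:''.
    static List<String> matches(String regex, String target) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(target);
        while (m.find()) {
            found.add(m.start() + ":'" + m.group() + "'");
        }
        return found;
    }

    public static void main(String[] args) {
        String target = "abaabaaab";
        for (String q : new String[] {"a", "a+", "a*", "a?"}) {
            System.out.println(q + " -> " + matches(q, target));
        }
    }
}
```

Wherever a* or a? cannot consume an 'a' (at each 'b' and at the end of the input), the matcher still reports a match of length zero.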
Pattern class split() method:

• We can use the Pattern class split() method to split the target String according to a particular pattern.

package RegularExcpression;

import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        Pattern p = Pattern.compile("\\s");
        String[] s = p.split("durga software solution"); // split at every whitespace character

        for (String s1 : s) {
            System.out.println(s1);
        }
    }
}

Output:

durga
software
solution

package RegularExcpression;

import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        Pattern p = Pattern.compile("a");
        String[] s = p.split("durga software solution"); // split at every 'a'

        for (String s1 : s) {
            System.out.println(s1);
        }
    }
}

Output:

durg
 softw
re solution

package RegularExcpression;

import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        // The string is split at every 'o'. The 'o' itself is not printed;
        // everything else is printed, including spaces.
        Pattern p = Pattern.compile("o");
        String[] s = p.split("durga software solution");

        for (String s1 : s) {
            System.out.println(s1);
        }
    }
}

Output:

durga s
ftware s
luti
n

package RegularExcpression;

import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        Pattern p = Pattern.compile("\\."); // the escaped dot matches a literal '.'
        String[] s = p.split("www.durgasoftware.com");
        for (String s1 : s) {
            System.out.println(s1);
        }
    }
}

Output:

www
durgasoftware
com

package RegularExcpression;

import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        Pattern p = Pattern.compile("[.]"); // inside a character class the dot is a literal '.'
        String[] s = p.split("www.durgasoftware.com");

        for (String s1 : s) {
            System.out.println(s1);
        }
    }
}

Output:

www
durgasoftware
com

• The String class also contains a split() method to split the target string according to a particular pattern.

package RegularExcpression;

public class RegExDemo {

    public static void main(String[] args) {

        String s = "durga software solution";
        String[] s1 = s.split("\\s"); // the pattern is passed as the argument

        for (String s2 : s1) {
            System.out.println(s2);
        }
    }
}

Output:

durga
software
solution

Note:

The Pattern class split() method takes the target string as its argument, whereas the String class split() method takes the pattern as its argument.
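The two forms differ only in which object receives which argument. The sketch below places them side by side and checks that they return the same array; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitComparisonSketch {

    // Pattern.split(): the pattern is the receiver, the target is the argument.
    static String[] splitWithPattern(String regex, String target) {
        return Pattern.compile(regex).split(target);
    }

    // String.split(): the target is the receiver, the pattern is the argument.
    static String[] splitWithString(String target, String regex) {
        return target.split(regex);
    }

    public static void main(String[] args) {
        String target = "durga software solution";
        String[] a = splitWithPattern("\\s", target);
        String[] b = splitWithString(target, "\\s");
        System.out.println(Arrays.toString(a));
        System.out.println(Arrays.toString(b));
        System.out.println(Arrays.equals(a, b));
    }
}
```

When the same pattern is applied to many strings, compiling it once with Pattern.compile() and reusing the Pattern object avoids recompiling the expression on every call.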
String Tokenizer:
• It is a class specially designed for tokenization.
• StringTokenizer is present in the java.util package.

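package RegularExcpression;

import java.util.StringTokenizer;

public class RegExDemo {

    public static void main(String[] args) {

        StringTokenizer st = new StringTokenizer("durga software solution");

        while (st.hasMoreTokens()) {
            System.out.println(st.nextToken());
        }
    }
}

Output:

durga
software
solution

• The default delimiters for StringTokenizer are the whitespace characters (space, tab, newline, carriage return and form feed). StringTokenizer works with delimiter characters, not regular expressions.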
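The default delimiter behaviour can be checked with a short experiment. The sketch below tokenizes a string that mixes tabs and newlines, and contrasts StringTokenizer with split("\\s") on a string containing two consecutive spaces; the class name, helper names and sample strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDelimiterSketch {

    // Tokenizes target using StringTokenizer's default delimiters.
    static List<String> tokens(String target) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(target);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Splits target at every single whitespace character.
    static String[] regexSplit(String target) {
        return target.split("\\s");
    }

    public static void main(String[] args) {
        // Tab and newline are default delimiters too, not only the space.
        System.out.println(tokens("durga\tsoftware\nsolution"));
        // Two consecutive spaces: the tokenizer skips them as one gap,
        // while split("\\s") yields an empty string between them.
        System.out.println(tokens("a  b").size());
        System.out.println(regexSplit("a  b").length);
    }
}
```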

package RegularExcpression;

import java.util.StringTokenizer;

public class RegExDemo {

    public static void main(String[] args) {

        // arguments: target string, delimiter
        StringTokenizer st = new StringTokenizer("20-12-2022", "-");

        while (st.hasMoreTokens()) {
            System.out.println(st.nextToken());
        }
    }
}

Output:

20
12
2022

Write a regular expression to represent all valid ten-digit mobile numbers.

Rules:

• Every number should contain exactly 10 digits.
• The 1st digit should be 7, 8 or 9.

[789][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]

Or

[7-9][0-9]{9}

10 digits / 11 digits:

0?[7-9][0-9]{9} (the ? symbol means the leading 0 may be present or absent)

10 digits / 11 digits / 12 digits:

(0|91)?[7-9][0-9]{9}
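The three mobile-number expressions can be compared on a few sample inputs. This is a minimal sketch, not the document's own program: it uses Matcher.matches(), which requires the whole input to match, and the sample numbers, class name and constant names are hypothetical.

```java
import java.util.regex.Pattern;

public class MobilePatternSketch {

    static final Pattern TEN = Pattern.compile("[7-9][0-9]{9}");
    static final Pattern TEN_OR_ELEVEN = Pattern.compile("0?[7-9][0-9]{9}");
    static final Pattern TEN_TO_TWELVE = Pattern.compile("(0|91)?[7-9][0-9]{9}");

    // True only if the entire string matches the pattern.
    static boolean isValid(Pattern p, String number) {
        return p.matcher(number).matches();
    }

    public static void main(String[] args) {
        // Made-up sample numbers, for illustration only.
        String[] samples = {"9876543210", "09876543210", "919876543210", "6876543210"};
        for (String s : samples) {
            System.out.println(s + " -> "
                    + isValid(TEN, s) + " "
                    + isValid(TEN_OR_ELEVEN, s) + " "
                    + isValid(TEN_TO_TWELVE, s));
        }
    }
}
```

Each wider pattern accepts everything the narrower one accepts, plus the optional 0 or 91 prefix.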

Mail-id:

S123_xzs.k@[Link]

Regular expression:

[a-zA-Z0-9][a-zA-Z0-9_.]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+
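To see what this expression accepts and rejects, it can be tried on a few addresses. The sketch below is only illustrative: the addresses use the reserved example.com domain, the class name MailPatternSketch is hypothetical, and Matcher.matches() is used so that the whole input must match.

```java
import java.util.regex.Pattern;

public class MailPatternSketch {

    // The mail-id expression from the text above.
    static final Pattern MAIL =
            Pattern.compile("[a-zA-Z0-9][a-zA-Z0-9_.]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+");

    // True only if the entire string matches the expression.
    static boolean isValid(String mailId) {
        return MAIL.matcher(mailId).matches();
    }

    public static void main(String[] args) {
        // Made-up sample addresses, for illustration only.
        System.out.println(isValid("S123_xzs.k@example.com")); // starts with a letter, has a dot part
        System.out.println(isValid("user@example.co.in"));     // several dot parts are allowed
        System.out.println(isValid("_user@example.com"));      // first character must be alphanumeric
        System.out.println(isValid("user@example"));           // at least one dot part is required
    }
}
```

The expression is deliberately simple and rejects some addresses that real mail systems accept, such as those containing '-' or '+'.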
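Write a program to check whether the given mobile number is valid or not.

package RegularExcpression;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        // Optional 0 or 91 prefix, then a digit from 7 to 9, then nine more digits.
        Pattern p = Pattern.compile("(0|91)?[7-9][0-9]{9}");
        Matcher m = p.matcher(args[0]); // the number is passed as a command-line argument
        // Valid only if the match covers the whole argument.
        if (m.find() && m.group().equals(args[0])) {
            System.out.println("Valid mobile number");
        } else {
            System.out.println("Invalid mobile number");
        }
    }
}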

Write a program to check whether the given mail-id is valid or not.

Replace the mobile number regular expression with the mail-id regular expression.

package RegularExcpression;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) {

        Pattern p = Pattern.compile("[a-zA-Z0-9][a-zA-Z0-9_.]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+");
        Matcher m = p.matcher(args[0]); // the mail-id is passed as a command-line argument
        // Valid only if the match covers the whole argument.
        if (m.find() && m.group().equals(args[0])) {
            System.out.println("Valid mail-id");
        } else {
            System.out.println("Invalid mail-id");
        }
    }
}

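Write a program to read all mobile numbers present in a given text file where mobile numbers are mixed with normal data.

package RegularExcpression;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) throws IOException {

        // The file names input.txt and output.txt are assumed;
        // the original names were lost in extraction.
        PrintWriter pw = new PrintWriter("output.txt");
        Pattern p = Pattern.compile("(0|91)?[7-9][0-9]{9}");
        BufferedReader br = new BufferedReader(new FileReader("input.txt"));
        String line = br.readLine();

        while (line != null) {
            Matcher m = p.matcher(line);
            while (m.find()) {
                pw.println(m.group()); // write each mobile number found on this line
            }
            line = br.readLine(); // read the next line, otherwise the loop never ends
        }
        pw.flush();
        pw.close();
        br.close();
    }
}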

Write a program to read all mail-ids present in a given text file where mail-ids are mixed with normal data.

package RegularExcpression;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExDemo {

    public static void main(String[] args) throws IOException {

        // The file names input.txt and output.txt are assumed;
        // the original names were lost in extraction.
        PrintWriter pw = new PrintWriter("output.txt");
        Pattern p = Pattern.compile("[a-zA-Z0-9][a-zA-Z0-9_.]*@[a-zA-Z0-9]+([.][a-zA-Z]+)+");
        BufferedReader br = new BufferedReader(new FileReader("input.txt"));
        String line = br.readLine();

        while (line != null) {
            Matcher m = p.matcher(line);
            while (m.find()) {
                pw.println(m.group()); // write each mail-id found on this line
            }
            line = br.readLine(); // read the next line, otherwise the loop never ends
        }
        pw.flush();
        pw.close();
        br.close();
    }
}

Write a program to display all .txt files present in E:\TypingMaster:

package RegularExcpression;

import java.io.*;
import java.util.regex.*;

public class RegExDemo {

    public static void main(String[] args) throws IOException {

        Pattern p = Pattern.compile("[a-zA-Z0-9_.$]+[.]txt");
        File f = new File("E:\\TypingMaster");
        String[] s = f.list(); // names of all files and folders in the directory
        int count = 0;
        for (String s1 : s) {
            Matcher m = p.matcher(s1);
            // Count the name only if the match covers the whole file name.
            if (m.find() && m.group().equals(s1)) {
                count++;
                System.out.println(s1);
            }
        }
        System.out.println("The total number: " + count);
    }
}

Write a program to display all .txt and .gif files present in E:\TypingMaster:

package RegularExcpression;

import java.io.*;
import java.util.regex.*;

public class RegExDemo {

    public static void main(String[] args) throws IOException {

        Pattern p = Pattern.compile("[a-zA-Z0-9_.$]+[.](txt|gif)");
        File f = new File("E:\\TypingMaster");
        String[] s = f.list(); // names of all files and folders in the directory
        int count = 0;
        for (String s1 : s) {
            Matcher m = p.matcher(s1);
            // Count the name only if the match covers the whole file name.
            if (m.find() && m.group().equals(s1)) {
                count++;
                System.out.println(s1);
            }
        }
        System.out.println("The total number: " + count);
    }
}

Output:

[Link]
[Link]
[Link]
[Link]
[Link]
[Link]
[Link]
[Link]
[Link]
[Link]
[Link]
The total number: 11
