Perl to Java regular expressions tutorial



  1. Credits
  2. Contact


This short tutorial is for readers who already know regular expressions in Perl and want to use them in Java as well. I recommend reading Scott A. Hommel's excellent tutorial first, then coming back here to find the main points summed up. I won't explain everything; to get acquainted with the basics, start with Hommel's tutorial.
Each example is shown first in Perl, followed by its Java equivalent.

Get a group from a string
Perl:
$text = "Asian.lst";
if ($text =~ m#(.*)\.lst#)
{
   $filename = $1;
}
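# $filename is "Asian" now

Java:
// needs: import java.util.regex.*;
String text = "Asian.lst";
String filename = null;
// same regexp as in Perl: (.*)\.lst
Matcher matcher = Pattern.compile("(.*)\\.lst")
                         .matcher(text);
// find() searches anywhere in the string, like Perl's =~
if (matcher.find())
{
   filename = matcher.group(1);
}
// filename is "Asian" now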
Note: in a Java String literal the backslash is itself an escape character, so every backslash of the regexp has to be doubled; that's why \\. is written for the regexp \. (a literal dot). If the regexp is read from a file instead, use normal escaping (\.); the doubling applies only to string literals in the source code.
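The doubling can be checked directly. The following is a minimal sketch, not part of the original examples; the class name EscapeDemo and the sample strings are made up for illustration. It shows that the literal holds a single backslash at run time, and that Pattern.quote() can be used when a piece of text should be matched literally without escaping it by hand.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name and the sample strings are illustrative only.
class EscapeDemo {
    // Written as a Java literal: the regexp backslash is doubled.
    static final String FROM_LITERAL = "(.*)\\.lst";

    // Pattern.quote() wraps the text so that every character is taken literally.
    static final String QUOTED_SUFFIX = "(.*)" + Pattern.quote(".lst");

    public static void main(String[] args) {
        // At run time the string holds a single backslash.
        System.out.println(FROM_LITERAL);                        // prints (.*)\.lst

        // The escaped dot matches only a real dot ...
        System.out.println("Asian.lst".matches(FROM_LITERAL));   // prints true
        System.out.println("Asian_lst".matches(FROM_LITERAL));   // prints false
        // ... while an unescaped dot matches any character.
        System.out.println("Asian_lst".matches("(.*).lst"));     // prints true

        // Same behaviour with Pattern.quote() instead of manual escaping.
        System.out.println("Asian.lst".matches(QUOTED_SUFFIX));  // prints true
        System.out.println("Asian_lst".matches(QUOTED_SUFFIX));  // prints false
    }
}
```

Pattern.quote() is mostly useful when the literal part comes from a variable, for example a file extension supplied at run time.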

Match a string against a regexp
Perl:
$text = "Asian.lst";
if ($text =~ m#ian#)
{
   print "contains 'ian'\n";
}

Java:
String text = "Asian.lst";
if (text.matches("^.*ian.*$"))
{
   System.out.println("contains 'ian'");
}
Note: Java is tricky here, because matches() tries to match the WHOLE string, so matches("ian") actually means matches("^ian$"), which finds nothing in this case! Remember: matches("") means matches("^$"), so if you want to look for a substring, you have to widen the regexp yourself. Example: does the string contain "ian"? The regexp is "^.*ian.*$", that is: anything, "ian", anything, where "anything" can also be the empty string. Since matches() anchors both ends anyway, ".*ian.*" gives the same result. Alternatively, Matcher.find() looks for the regexp anywhere in the string, as Perl's =~ does.
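To make the difference concrete, here is a small sketch comparing the three matching methods of java.util.regex.Matcher on the same string. The class and method names (MatchModes, whole, atStart, anywhere) are invented for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the tutorial.
class MatchModes {
    // matches(): the regexp has to cover the whole string.
    static boolean whole(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    // lookingAt(): the regexp has to match at the start; the rest is ignored.
    static boolean atStart(String regex, String text) {
        return Pattern.compile(regex).matcher(text).lookingAt();
    }

    // find(): the regexp may match anywhere, like Perl's =~.
    static boolean anywhere(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "Asian.lst";
        System.out.println(whole("ian", text));      // prints false
        System.out.println(atStart("ian", text));    // prints false
        System.out.println(anywhere("ian", text));   // prints true
        System.out.println(atStart("As", text));     // prints true
        System.out.println(whole(".*ian.*", text));  // prints true
    }
}
```

For a plain "does it contain" test, find() keeps the regexp identical to the Perl one, so it is usually the easier choice when porting.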

Read a file line by line and match a regexp against each line
Perl:
open (F1, "<input.txt") || die();
while (<F1>)
{
   chomp;
   if (m#dog#i) {
      print $_, "\n";
   }
}
close F1;

Java:
// needs: import java.io.*; import java.util.regex.*;
Pattern pattern = Pattern.compile("(?i)dog");
String line;

// try-with-resources closes the reader, as "close F1" does in Perl
try (BufferedReader br = new BufferedReader(new FileReader("input.txt")))
{
   while ( (line = br.readLine()) != null )
   {
      Matcher matcher = pattern.matcher(line);
      if (matcher.find()) {
         System.out.println(line);
      }
   }
}

input.txt:
yo
dogdog
a dog is here
a cat is here
Snoop Doggy Dog
pussycat
output:
dogdog
a dog is here
Snoop Doggy Dog
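Note: in the Java example, the (?i) in "(?i)dog" means: ignore case. This is a modifier (an embedded flag); it takes effect from the point where it appears, so put it at the beginning of the regexp to make the whole pattern case-insensitive. The same effect can be achieved by passing Pattern.CASE_INSENSITIVE as the second argument of Pattern.compile(). For more info about modifiers, read the tutorial mentioned above.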
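The two ways of ignoring case, and the effect of where the embedded flag is placed, can be sketched as follows. The class name CaseFlagDemo, the helper contains() and the sample inputs are assumptions made for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names and sample inputs are illustrative only.
class CaseFlagDemo {
    // Embedded flag at the start: the whole regexp ignores case.
    static final Pattern INLINE = Pattern.compile("(?i)dog");

    // Same effect with a flag constant passed to compile().
    static final Pattern WITH_CONSTANT =
            Pattern.compile("dog", Pattern.CASE_INSENSITIVE);

    // Flag in the middle: only the part after it ("og") ignores case.
    static final Pattern PARTIAL = Pattern.compile("d(?i)og");

    // True if the pattern is found anywhere in the line.
    static boolean contains(Pattern pattern, String line) {
        return pattern.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(contains(INLINE, "Snoop Doggy Dog"));        // prints true
        System.out.println(contains(WITH_CONSTANT, "Snoop Doggy Dog")); // prints true
        System.out.println(contains(PARTIAL, "dOG"));                   // prints true
        System.out.println(contains(PARTIAL, "DOG"));                   // prints false
    }
}
```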

Replace the first, then all the occurrences of a substring in a string
Perl:
my $text = "a dog and a dog";

$text =~ s#dog#cat#;
# "a cat and a dog"

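$text =~ s#dog#cat#g;
# "a cat and a cat"

Java:
String text  = "a dog and a dog";

String text1 = text.replaceFirst("dog","cat");
// "a cat and a dog"

String text2 = text.replaceAll("dog","cat");
// "a cat and a cat"
// text itself is unchanged: Java strings are immutable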
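Perl users often refer to captured groups in the replacement part, as in s#(\w+)\.lst#$1.txt#g. A minimal sketch of the Java counterpart follows; the class ReplaceDemo, its methods and the sample strings are invented for illustration. In Java the replacement string also understands $1, which means a literal dollar sign in the replacement has to be protected, for instance with Matcher.quoteReplacement().

```java
import java.util.regex.Matcher;

// Hypothetical demo class; the names and sample strings are illustrative only.
class ReplaceDemo {
    // $1 in the replacement refers to the first captured group, as in Perl.
    static String renameLists(String text) {
        return text.replaceAll("(\\w+)\\.lst", "$1.txt");
    }

    // quoteReplacement() makes '$' and '\' in the replacement literal.
    static String insertPrice(String text, String price) {
        return text.replaceAll("PRICE", Matcher.quoteReplacement(price));
    }

    public static void main(String[] args) {
        String text = "Asian.lst and African.lst";
        System.out.println(renameLists(text));                 // prints Asian.txt and African.txt
        System.out.println(text);                              // prints Asian.lst and African.lst
        System.out.println(insertPrice("costs PRICE", "$5"));  // prints costs $5
    }
}
```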

Find all the occurrences of a substring in a string
Perl:
my $text = '<a href="ad1">sdqs</a><a href="ad2">sds</a><a href=ad3>qs</a>';

while ($text =~ m#href="?(.*?)"?>#gi)
{
   print $1, "\n";
}
# result:
# ad1
# ad2
# ad3

Java:
String text = "<a href=\"ad1\">sdqs</a><a href=\"ad2\">sds</a><a href=ad3>qs</a>";
Pattern pattern = Pattern.compile("(?i)href=\"?(.*?)\"?>");
Matcher matcher = pattern.matcher(text);

while (matcher.find())
{
   System.out.println(matcher.group(1));
}
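In Perl, the same regexp used in list context (my @all = $text =~ m#...#gi;) returns all the captured groups at once. Java has no direct shortcut for this in the classes used here, but the find() loop can be wrapped in a small helper. This is only a sketch; the class FindAllDemo and the method hrefs() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the tutorial.
class FindAllDemo {
    // Same regexp as in the example above, compiled once and reused.
    static final Pattern HREF = Pattern.compile("(?i)href=\"?(.*?)\"?>");

    // Collects group 1 of every match, in order of appearance.
    static List<String> hrefs(String html) {
        List<String> result = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "<a href=\"ad1\">sdqs</a><a href=\"ad2\">sds</a><a href=ad3>qs</a>";
        System.out.println(hrefs(text));  // prints [ad1, ad2, ad3]
    }
}
```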

Credits
For this tutorial I used Scott A. Hommel's "Regular Expressions" lesson, which is available at http://java.sun.com/docs/books/tutorial/extra/regex/index.html.

How to contact me
If you have any ideas on how to improve this tutorial, send your suggestions or patches to szathml@delfin.unideb.hu.


Created by Laszlo Szathmary, alias Jabba Laci, 2003--2005