Perl to Java regular expressions tutorial |
| Get a group from a string |
$text = "Asian.lst";
if ($text =~ m#(.*)\.lst#)
{
$filename = $1;
}
# $filename is "Asian" now
|
String text = "Asian.lst";
String filename = null;
Matcher matcher = Pattern.compile("(.*)\\.lst")
.matcher(text);
if (matcher.lookingAt())
{
filename = matcher.group(1);
}
|
In a Java string literal the backslash itself has to be escaped, so the regexp \. is written as \\. in the source code.
If you read the regexp from a file, use normal escaping (\.); the double escaping applies only to string literals.
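The difference between the two escaping styles can be checked with a small sketch. The class name EscapingDemo and the helper baseName are hypothetical names made up for this illustration, and the pattern "read from a file" is only simulated by building the string character by character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapingDemo {
    // Hypothetical helper: returns group 1 of the regexp, or null if there is no match.
    static String baseName(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        return matcher.lookingAt() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        // In source code the backslash is doubled; the compiler stores
        // the nine characters (.*)\.lst in the string.
        String fromLiteral = "(.*)\\.lst";

        // Simulates a regexp read from a file: it contains a single
        // backslash (character code 92), exactly as it would be typed there.
        String fromFile = "(.*)" + (char) 92 + ".lst";

        System.out.println(fromLiteral.equals(fromFile));        // true
        System.out.println(baseName(fromFile, "Asian.lst"));     // Asian
    }
}
```

Both strings hold the same regexp, so a pattern loaded at run time needs no extra backslashes.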
| Match a string against a regexp |
$text = "Asian.lst";
if ($text =~ m#ian#)
{
print "contains 'ian'\n";
}
|
String text = "Asian.lst";
if (text.matches("^.*ian.*$"))
{
System.out.println("contains 'ian'");
}
|
matches() tries to match the WHOLE string, so matches("ian")
actually means matches("^ian$"),
which finds nothing in this case. Remember: matches("") means matches("^$"),
so to look for a substring you have to expand the "^$" regexp.
For example, does the string contain "ian"? The regexp is
"^.*ian.*$", that is:
anything, "ian", anything, where "anything" can also be an empty string.
Alternatively, Matcher.find() searches for the regexp anywhere in the string, as Perl's =~ does.
| Read a file line by line and match a regexp against each line |
open (F1, "<input.txt") || die();
while (<F1>)
{
chomp;
if (m#dog#i) {
print $_, "\n";
}
}
close F1;
|
BufferedReader br = new BufferedReader(new FileReader("input.txt"));
Pattern pattern = Pattern.compile("(?i)dog");
Matcher matcher;
String line;
while ( (line = br.readLine()) != null )
{
matcher = pattern.matcher(line);
if (matcher.find()) {
System.out.println(line);
}
}
br.close();
|
input.txt:
yo
dogdog
a dog is here
a cat is here
Snoop Doggy Dog
pussycat
|
output:
dogdog
a dog is here
Snoop Doggy Dog
|
In "(?i)dog", (?i) means: ignore case. This is an inline modifier; put it at the beginning
of the pattern so that it applies to the whole regexp. For more info about modifiers, read the
"Regular Expressions" lesson listed in the Credits.
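As an alternative to the inline modifier, the same option can be passed to Pattern.compile() as the flag Pattern.CASE_INSENSITIVE. The sketch below compares the two forms; the class IgnoreCaseDemo and its method names are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;

class IgnoreCaseDemo {
    // Inline modifier inside the regexp itself.
    static final Pattern INLINE = Pattern.compile("(?i)dog");

    // The same option given as a compile flag.
    static final Pattern FLAGGED = Pattern.compile("dog", Pattern.CASE_INSENSITIVE);

    static boolean foundInline(String line) {
        return INLINE.matcher(line).find();
    }

    static boolean foundFlagged(String line) {
        return FLAGGED.matcher(line).find();
    }

    public static void main(String[] args) {
        String[] lines = { "yo", "dogdog", "Snoop Doggy Dog", "pussycat" };
        for (String line : lines) {
            // Both variants are expected to agree on every line.
            System.out.println(line + ": " + foundInline(line) + " " + foundFlagged(line));
        }
    }
}
```

The flag form keeps the regexp text identical to the Perl one, which may be convenient when the pattern comes from a file.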
| Replace the first, then all the occurrences of a substring in a string |
my $text = "a dog and a dog";
$text =~ s#dog#cat#;     # "a cat and a dog"
$text =~ s#dog#cat#g;    # "a cat and a cat"
|
String text = "a dog and a dog";
String text1 = text.replaceFirst("dog","cat");
String text2 = text.replaceAll("dog","cat");
|
| Find all the occurrences of a substring in a string |
my $text = '<a href="ad1">sdqs</a><a href="ad2">sds</a><a href=ad3>qs</a>';
while ($text =~ m#href="?(.*?)"?>#gi)
{
print $1, "\n";
}
# result:
# ad1
# ad2
# ad3
|
String text = "<a href=\"ad1\">sdqs</a><a href=\"ad2\">sds</a><a href=ad3>qs</a>";
Pattern pattern = Pattern.compile("(?i)href=\"?(.*?)\"?>");
Matcher matcher = pattern.matcher(text);
while (matcher.find())
{
System.out.println(matcher.group(1));
}
|
Credits
For this tutorial I used Scott A. Hommel's "Regular Expressions" lesson, which is available at
http://java.sun.com/docs/books/tutorial/extra/regex/index.html.
How to contact me
If you have any ideas on how to improve this tutorial, please send your suggestions/patches
to szathml@delfin.unideb.hu.