The StringTokenizer class in Java is a legacy utility class that breaks a string into tokens (substrings) separated by specified delimiter characters. It is part of the java.util package. Its own Javadoc discourages its use in new code in favor of more flexible options such as String.split() or the java.util.regex package, and Scanner is another common alternative, but the class still appears in older codebases and remains usable for simple cases.
Features of StringTokenizer
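- Breaks a string into tokens based on one or more delimiter characters. Consecutive delimiters are treated as a single separator, so empty tokens are never returned.
- Tokens are extracted sequentially.
- By default, uses whitespace (space, tab, newline, carriage return, and form feed) as the delimiters.
- Does not support regular expressions for delimiters; the delimiter string is treated as a set of individual characters.
- Not thread-safe: an instance tracks its current position internally, so it should not be shared between threads without external synchronization.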
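The delimiter behavior listed above can be checked with a small sketch. The class name DelimiterSetDemo and the helper drain() are hypothetical names introduced only for this illustration; the sample strings are made up as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DelimiterSetDemo {
    // Hypothetical helper: collects every remaining token into a list.
    static List<String> drain(StringTokenizer st) {
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // "=>" is a set of two delimiter characters, not the sequence "=>".
        System.out.println(drain(new StringTokenizer("a=b>c=>d", "=>")));      // [a, b, c, d]
        // "." is a literal dot here, not a regex wildcard.
        System.out.println(drain(new StringTokenizer("www.example.com", "."))); // [www, example, com]
        // The default delimiter set also covers tabs and newlines.
        System.out.println(drain(new StringTokenizer("Java\tis\nfun")));        // [Java, is, fun]
    }
}
```

Because the delimiter is never interpreted as a pattern, characters such as "." or "|" need no escaping here, unlike with String.split().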
Constructors
- StringTokenizer(String str): Creates a tokenizer for str using the default delimiters (whitespace).
- StringTokenizer(String str, String delim): Creates a tokenizer for str in which every character of delim acts as a delimiter.
- StringTokenizer(String str, String delim, boolean returnDelims): Creates a tokenizer for str using the specified delimiters; if returnDelims is true, each delimiter character is also returned as a one-character token.
Common Methods
Method | Description |
hasMoreTokens() | Returns true if there are more tokens available. |
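nextToken() | Returns the next token in the string; throws NoSuchElementException if no tokens remain. |
nextToken(String delim) | Switches to a new set of delimiters and returns the next token. The new set stays in effect for later calls. |
countTokens() | Returns the number of tokens remaining, that is, how many times nextToken() can still be called before it throws an exception. The current position is not advanced. |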
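The two less obvious methods, nextToken(String delim) and countTokens(), can be sketched as follows. The class NextTokenDelimDemo, the helper parse(), and the "user:role,role" input format are assumptions made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class NextTokenDelimDemo {
    // Hypothetical helper: splits "user:role,role,..." into the user followed by the roles.
    // Assumes the line contains a user and at least one role.
    static List<String> parse(String line) {
        StringTokenizer st = new StringTokenizer(line, ":");
        List<String> parts = new ArrayList<>();
        parts.add(st.nextToken()); // user, delimited by ':'
        // Switch delimiters. ':' stays in the set because the tokenizer is still
        // positioned on the ':' that ended the previous token; dropping it would
        // make that ':' part of the next token.
        parts.add(st.nextToken(":,"));
        // The new delimiter set ":," remains in effect for plain nextToken() calls.
        while (st.hasMoreTokens()) {
            parts.add(st.nextToken());
        }
        return parts;
    }

    public static void main(String[] args) {
        StringTokenizer st = new StringTokenizer("alice:admin,editor", ":");
        System.out.println(st.countTokens()); // 2
        st.nextToken();
        System.out.println(st.countTokens()); // 1, only the remaining tokens are counted
        System.out.println(parse("alice:admin,editor,viewer")); // [alice, admin, editor, viewer]
    }
}
```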
Examples
1. Basic Tokenization
import java.util.StringTokenizer;

public class StringTokenizerExample {
    public static void main(String[] args) {
        String text = "Java is fun";
        // No delimiter argument: tokens are separated by whitespace.
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            System.out.println(tokenizer.nextToken());
        }
    }
}
Output:
Java
is
fun
2. Using Custom Delimiters
import java.util.StringTokenizer;

public class StringTokenizerExample {
    public static void main(String[] args) {
        String text = "apple,banana;cherry";
        // Both ',' and ';' act as delimiters.
        StringTokenizer tokenizer = new StringTokenizer(text, ",;");
        while (tokenizer.hasMoreTokens()) {
            System.out.println(tokenizer.nextToken());
        }
    }
}
Output:
apple
banana
cherry
3. Including Delimiters as Tokens
import java.util.StringTokenizer;

public class StringTokenizerExample {
    public static void main(String[] args) {
        String text = "apple,banana;cherry";
        // returnDelims = true: each delimiter is returned as its own token.
        StringTokenizer tokenizer = new StringTokenizer(text, ",;", true);
        while (tokenizer.hasMoreTokens()) {
            System.out.println(tokenizer.nextToken());
        }
    }
}
Output:
apple
,
banana
;
cherry
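Returning delimiters as tokens is handy when the delimiters carry meaning, for example operators in a simple arithmetic expression. The sketch below is only an illustration: the class ExpressionTokens and the method tokenize() are hypothetical names, and the input is assumed to contain no whitespace, parentheses, or signed numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ExpressionTokens {
    // Hypothetical helper: splits an expression into operands and operators.
    static List<String> tokenize(String expr) {
        // The operators are the delimiters; returnDelims = true keeps them in the output.
        StringTokenizer st = new StringTokenizer(expr, "+-*/", true);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("3+4*2")); // [3, +, 4, *, 2]
    }
}
```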
4. Counting Tokens
import java.util.StringTokenizer;

public class StringTokenizerExample {
    public static void main(String[] args) {
        String text = "Java|Python|C++";
        StringTokenizer tokenizer = new StringTokenizer(text, "|");
        // Called before any nextToken(), so all tokens are still remaining.
        System.out.println("Number of tokens: " + tokenizer.countTokens());
        while (tokenizer.hasMoreTokens()) {
            System.out.println(tokenizer.nextToken());
        }
    }
}
Output:
Number of tokens: 3
Java
Python
C++
Comparison with Other Tokenization Techniques
Feature | StringTokenizer | String.split() | Scanner |
Thread-Safe | No | No | No |
Supports Regex | No | Yes | Yes (via useDelimiter()) |
Includes Delimiters | Optional | No | No |
Flexibility | Limited (basic delimiters only) | High (supports regex and various patterns) | High (regex delimiters and typed input parsing, e.g. nextInt()) |
Example of String.split()
public class SplitExample {
    public static void main(String[] args) {
        String text = "Java|Python|C++";
        // split() takes a regex, so the '|' metacharacter must be escaped.
        String[] tokens = text.split("\\|");
        for (String token : tokens) {
            System.out.println(token);
        }
    }
}
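Scanner can also split on a regular-expression delimiter through useDelimiter(). The following is a minimal sketch; the class ScannerDelimiterDemo, the helper tokens(), and the sample input are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerDelimiterDemo {
    // Hypothetical helper: splits on ',' or ';' and ignores whitespace around them.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            // The delimiter is a regex: optional whitespace, a comma or semicolon, optional whitespace.
            sc.useDelimiter("\\s*[,;]\\s*");
            while (sc.hasNext()) {
                out.add(sc.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("apple , banana;cherry")); // [apple, banana, cherry]
    }
}
```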
Advantages of StringTokenizer
- Simple and easy to use for basic tokenization.
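- Lightweight: no regular-expression machinery is involved. It can be faster than split() in some simple cases, although the difference depends on the JDK version and the input, so it is worth measuring before relying on it.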
- Can include delimiters as tokens if needed.
Disadvantages of StringTokenizer
- Does not support regular expressions for delimiters.
- Considered a legacy class; String.split() and Scanner are preferred for new development.
- Not thread-safe.
Use Cases
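- When splitting strings on simple, fixed delimiter characters (e.g., comma-separated values that contain no empty or quoted fields, since StringTokenizer skips empty fields and does not understand quoting).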
- Lightweight tokenization where performance is critical and regular expressions are not required.
- Legacy codebases where StringTokenizer is already in use.
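One caveat when using StringTokenizer on delimited data is that empty fields disappear. The sketch below compares it with String.split(); the class EmptyFieldDemo and its two helper methods are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

class EmptyFieldDemo {
    // Number of fields StringTokenizer reports for a comma-separated line.
    static int tokenizerCount(String line) {
        return new StringTokenizer(line, ",").countTokens();
    }

    // Number of fields String.split() reports; limit -1 keeps trailing empty fields.
    static int splitCount(String line) {
        return line.split(",", -1).length;
    }

    public static void main(String[] args) {
        String line = "a,,c"; // the middle field is empty
        System.out.println(tokenizerCount(line));                  // 2, the empty field is skipped
        System.out.println(splitCount(line));                      // 3
        System.out.println(Arrays.toString(line.split(",", -1)));  // [a, , c]
    }
}
```

If field positions matter, as they usually do in CSV-like data, split() with a negative limit or a dedicated CSV parser is the safer choice.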
Conclusion
While the StringTokenizer class is straightforward and efficient for basic tokenization, its limitations (e.g., lack of regex support and being non-thread-safe) make it less ideal for modern Java development. In most cases, String.split() or Scanner is a better alternative. However, StringTokenizer remains a quick and lightweight option for simple tokenization tasks.