StringTokenizer
Splitting strings into tokens
StringTokenizer Class
StringTokenizer is a legacy class for breaking strings into tokens based on delimiters. While still functional, String.split() is preferred in modern Java.
Constructor Options
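- StringTokenizer(String str) - Default delimiters (space, tab, newline, carriage return, form feed)
- StringTokenizer(String str, String delim) - Custom delimiters; each character in delim is a separate delimiter
- StringTokenizer(String str, String delim, boolean returnDelims) - When returnDelims is true, delimiters are returned as tokens too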
⚠️ Legacy Class: StringTokenizer is a legacy class. For new code, prefer String.split() or Pattern.split(), which are more powerful and support regex.
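To make the recommendation concrete, here is a minimal sketch of moving a default-delimiter StringTokenizer loop to String.split(). The class and method names (TokenizerMigration, withTokenizer, withSplit) are hypothetical and exist only for this illustration. The sketch assumes ordinary whitespace input. The regex \s also matches the vertical tab, which the default StringTokenizer delimiters do not, so the two are not strictly identical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, for illustration only
class TokenizerMigration {

    // Legacy approach: default delimiters, runs of whitespace are collapsed
    static List<String> withTokenizer(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // split() approach: trim first, otherwise leading whitespace
    // produces an empty first element
    static List<String> withSplit(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>(); // "".split(...) would yield one empty string
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    public static void main(String[] args) {
        String text = "  Hello   World\tJava ";
        System.out.println(withTokenizer(text)); // [Hello, World, Java]
        System.out.println(withSplit(text));     // [Hello, World, Java]
    }
}
```

The trim() call and the empty-string check are the parts most often missed when replacing a tokenizer loop with split().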
Key Methods
- hasMoreTokens() - Check whether more tokens exist
- nextToken() - Get the next token
- countTokens() - Count the remaining tokens
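How these three methods interact is easy to get wrong, so here is a small sketch. It records the value of countTokens() before each nextToken() call. The class and method names (CountTokensDemo, remainingCounts) are made up for this example.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

// Hypothetical demo class, for illustration only
class CountTokensDemo {

    // Returns the remaining-token count observed before each nextToken() call
    static int[] remainingCounts(String text) {
        StringTokenizer st = new StringTokenizer(text);
        int[] counts = new int[st.countTokens()];
        int i = 0;
        while (st.hasMoreTokens()) {
            counts[i++] = st.countTokens(); // does not advance the tokenizer
            st.nextToken();                 // advances, so the remaining count drops by one
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(remainingCounts("a b c"))); // [3, 2, 1]
    }
}
```

Because countTokens() reports what is left rather than the original total, it is usually safest to read it once before iterating, or to rely on hasMoreTokens() alone.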
Code Examples
Basic StringTokenizer with default delimiters
java
import java.util.StringTokenizer;

// Basic usage with default delimiters (space, tab, newline, carriage return, form feed)
String sentence = "Hello World Java Programming";
StringTokenizer st = new StringTokenizer(sentence);

while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
// Output:
// Hello
// World
// Java
// Programming

// Count tokens (countTokens() does not advance the tokenizer)
StringTokenizer st2 = new StringTokenizer(sentence);
System.out.println("Token count: " + st2.countTokens()); // Token count: 4
Custom and multiple delimiters
java
import java.util.StringTokenizer;

// Custom delimiters
String csv = "apple,banana,cherry,date";
StringTokenizer st = new StringTokenizer(csv, ",");

while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
// Output (one token per line): apple, banana, cherry, date

// Multiple delimiters: each character in the delimiter string counts separately
String mixed = "one;two,three:four";
StringTokenizer st2 = new StringTokenizer(mixed, ";,:");

while (st2.hasMoreTokens()) {
    System.out.println(st2.nextToken());
}
// Output (one token per line): one, two, three, four

// Include delimiters as tokens
String math = "10+20-5*2";
StringTokenizer st3 = new StringTokenizer(math, "+-*", true);

while (st3.hasMoreTokens()) {
    System.out.println(st3.nextToken());
}
// Output (one token per line): 10, +, 20, -, 5, *, 2
Modern alternatives to StringTokenizer
java
import java.util.Arrays;
import java.util.Scanner;
import java.util.regex.Pattern;

// Modern alternatives (PREFERRED)

String sentence = "Hello World Java";

// 1. String.split() - Most common
// "\\s+" splits on runs of whitespace, like StringTokenizer's default behavior
String[] words = sentence.split("\\s+");
for (String word : words) {
    System.out.println(word);
}

// 2. Split with regex
String csv = "apple, banana, cherry";
String[] items = csv.split(",\\s*"); // Comma + optional whitespace
// ["apple", "banana", "cherry"]

// 3. Stream API (Java 8+)
Arrays.stream("one,two,three".split(","))
    .forEach(System.out::println);

// 4. Pattern.split() for complex patterns (compile once, reuse)
Pattern pattern = Pattern.compile("[,;:]");
String[] parts = pattern.split("a,b;c:d");
// ["a", "b", "c", "d"]

// 5. Scanner for advanced parsing (typed tokens)
Scanner scanner = new Scanner("10 20 30");
while (scanner.hasNextInt()) {
    System.out.println(scanner.nextInt());
}
Practical use cases and gotchas
java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Practical examples

// 1. Parse key-value pairs
String config = "name=John;age=30;city=NYC";
StringTokenizer st = new StringTokenizer(config, ";");
Map<String, String> map = new HashMap<>();

while (st.hasMoreTokens()) {
    // Limit 2 keeps any '=' inside the value intact
    String[] pair = st.nextToken().split("=", 2);
    if (pair.length == 2) { // skip entries without '='
        map.put(pair[0], pair[1]);
    }
}

// 2. Word count
public static int countWords(String text) {
    return new StringTokenizer(text).countTokens();
}

// 3. Convert to List (modern approach)
String input = "red,green,blue";
List<String> colors = Arrays.asList(input.split(",")); // fixed-size list

// 4. Handle empty tokens (StringTokenizer skips them!)
String data = "a,,b,c";
// StringTokenizer: "a", "b", "c" (skips empty!)
// split: "a", "", "b", "c" (includes empty)

String[] withEmpty = data.split(",", -1); // -1 also keeps trailing empty strings
// ["a", "", "b", "c"]
Use Cases
- Simple string splitting
- Parsing delimited data
- Word counting
- Legacy code maintenance
- Configuration parsing
- Quick tokenization needs
Common Mistakes to Avoid
- Using StringTokenizer when split() is cleaner
- Not knowing StringTokenizer skips empty tokens
- Forgetting it's a legacy class
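- Calling nextToken() without checking hasMoreTokens() first (throws NoSuchElementException)
- Using it for complex parsing (use regex instead)
- Using countTokens() as a loop bound while also calling nextToken() (the remaining count shrinks as tokens are consumed)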
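Two of the mistakes above are easy to reproduce in a few lines. This sketch compares how StringTokenizer and split() treat the empty field in "a,,b,c", and shows nextToken() throwing once the tokens run out. The class and helper names (TokenizerPitfalls, tokenize, throwsWhenExhausted) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

// Hypothetical demo class, for illustration only
class TokenizerPitfalls {

    // Collects all tokens, guarding every nextToken() with hasMoreTokens()
    static List<String> tokenize(String text, String delims) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delims);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Calls nextToken() once too often and reports whether it threw
    static boolean throwsWhenExhausted() {
        StringTokenizer st = new StringTokenizer("only");
        st.nextToken(); // consumes the single token
        try {
            st.nextToken(); // nothing left
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String data = "a,,b,c";
        System.out.println(tokenize(data, ","));                // [a, b, c]
        System.out.println(Arrays.asList(data.split(",", -1))); // [a, , b, c]
        System.out.println(throwsWhenExhausted());              // true
    }
}
```

If column positions matter, as in CSV-like data, the silently dropped empty field is usually the strongest reason to choose split() over StringTokenizer.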