StringTokenizer

Splitting strings into tokens


StringTokenizer Class

StringTokenizer is a legacy class for breaking strings into tokens based on delimiters. While still functional, String.split() is preferred in modern Java.

Constructor Options

  • StringTokenizer(String str) - Default delimiters (space, tab, newline, carriage return, form feed)
  • StringTokenizer(String str, String delim) - Custom delimiters
  • StringTokenizer(String str, String delim, boolean returnDelims) - Include delimiters as tokens

āš ļø Legacy Class: StringTokenizer is a legacy class. For new code, prefer String.split() or Pattern.split() which are more powerful and support regex.

Key Methods

  • hasMoreTokens() - Check if more tokens exist
  • nextToken() - Get the next token (throws NoSuchElementException if none remain)
  • countTokens() - Count the remaining tokens without consuming them
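The three methods interact in a way that is easy to misread, so here is a minimal sketch of their behavior. The class name `TokenizerMethodsDemo` and its helper methods are hypothetical and exist only for illustration; the `StringTokenizer` calls themselves are the standard ones listed above.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of any library.
public class TokenizerMethodsDemo {

    // Reads up to `reads` tokens, then reports how many are still left.
    static int remainingAfter(String text, int reads) {
        StringTokenizer st = new StringTokenizer(text);
        for (int i = 0; i < reads && st.hasMoreTokens(); i++) {
            st.nextToken();
        }
        return st.countTokens();
    }

    // Drains the tokenizer, then checks whether one more nextToken() throws.
    static boolean throwsWhenExhausted(String text) {
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            st.nextToken();
        }
        try {
            st.nextToken();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(remainingAfter("a b c", 0));   // 3
        System.out.println(remainingAfter("a b c", 1));   // 2
        System.out.println(remainingAfter("a b c", 3));   // 0
        System.out.println(throwsWhenExhausted("a b c")); // true
    }
}
```

Guarding every nextToken() call with hasMoreTokens(), as the helpers do, is the usual way to avoid the exception.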

Code Examples

Basic StringTokenizer with default delimiters

java
import java.util.StringTokenizer;

// Basic usage with default delimiters (space, tab, newline)
String sentence = "Hello World Java Programming";
StringTokenizer st = new StringTokenizer(sentence);

while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
// Output:
// Hello
// World
// Java
// Programming

// Count tokens
StringTokenizer st2 = new StringTokenizer(sentence);
System.out.println("Token count: " + st2.countTokens());  // 4

Custom and multiple delimiters

java
import java.util.StringTokenizer;

// Custom delimiters
String csv = "apple,banana,cherry,date";
StringTokenizer st = new StringTokenizer(csv, ",");

while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
// Output (one per line): apple, banana, cherry, date

// Multiple delimiters: each character in the string is a separate delimiter
String mixed = "one;two,three:four";
StringTokenizer st2 = new StringTokenizer(mixed, ";,:");

while (st2.hasMoreTokens()) {
    System.out.println(st2.nextToken());
}
// Output (one per line): one, two, three, four

// Include delimiters as tokens
String math = "10+20-5*2";
StringTokenizer st3 = new StringTokenizer(math, "+-*", true);

while (st3.hasMoreTokens()) {
    System.out.println(st3.nextToken());
}
// Output (one per line): 10, +, 20, -, 5, *, 2

Modern alternatives to StringTokenizer

java
import java.util.Arrays;
import java.util.Scanner;
import java.util.regex.Pattern;

// Modern alternatives (PREFERRED)

String sentence = "Hello World Java";

// 1. String.split() - Most common
String[] words = sentence.split(" ");
for (String word : words) {
    System.out.println(word);
}

// 2. Split with regex
String csv = "apple, banana,  cherry";
String[] items = csv.split(",\\s*");  // Comma + optional whitespace
// ["apple", "banana", "cherry"]

// 3. Stream API (Java 8+)
Arrays.stream("one,two,three".split(","))
    .forEach(System.out::println);

// 4. Pattern.split() for complex patterns
Pattern pattern = Pattern.compile("[,;:]");
String[] parts = pattern.split("a,b;c:d");
// ["a", "b", "c", "d"]

// 5. Scanner for advanced parsing
Scanner scanner = new Scanner("10 20 30");
while (scanner.hasNextInt()) {
    System.out.println(scanner.nextInt());
}
scanner.close();

Practical use cases and gotchas

java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Practical examples

// 1. Parse key-value pairs
String config = "name=John;age=30;city=NYC";
StringTokenizer st = new StringTokenizer(config, ";");
Map<String, String> map = new HashMap<>();

while (st.hasMoreTokens()) {
    // Limit 2 keeps any '=' inside the value intact
    String[] pair = st.nextToken().split("=", 2);
    if (pair.length == 2) {  // Skip entries that have no '='
        map.put(pair[0], pair[1]);
    }
}

// 2. Word count
public static int countWords(String text) {
    return new StringTokenizer(text).countTokens();
}

// 3. Convert to List (modern approach)
String input = "red,green,blue";
List<String> colors = Arrays.asList(input.split(","));

// 4. Handle empty tokens (StringTokenizer skips them!)
String data = "a,,b,c";
// StringTokenizer: "a", "b", "c" (skips empty!)
// split: "a", "", "b", "c" (includes empty)

String[] withEmpty = data.split(",", -1);  // -1 also keeps trailing empty strings
// ["a", "", "b", "c"]
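
The empty-token difference is easiest to see side by side. The following is a minimal sketch; the class `EmptyTokenDemo` and its helper methods are hypothetical names used only for this illustration, and the input has two trailing commas added so that the effect of the `-1` limit becomes visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of any library.
public class EmptyTokenDemo {

    // Collects every token StringTokenizer produces for the given delimiters.
    static List<String> viaTokenizer(String data, String delims) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(data, delims);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Wraps String.split(regex, limit); a limit of 0 behaves like split(regex).
    static List<String> viaSplit(String data, String regex, int limit) {
        return Arrays.asList(data.split(regex, limit));
    }

    public static void main(String[] args) {
        String data = "a,,b,c,,";

        // Empty fields vanish entirely: 3 tokens
        System.out.println(viaTokenizer(data, ","));

        // Inner empty field kept, trailing empties dropped: 4 elements
        System.out.println(viaSplit(data, ",", 0));

        // Negative limit keeps trailing empties too: 6 elements
        System.out.println(viaSplit(data, ",", -1));
    }
}
```

When column positions matter, as in CSV-like data, the negative-limit form is the only one of the three that preserves every field.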

Use Cases

  • Simple string splitting
  • Parsing delimited data
  • Word counting
  • Legacy code maintenance
  • Configuration parsing
  • Quick tokenization needs

Common Mistakes to Avoid

  • Using StringTokenizer when split() is cleaner
  • Not knowing StringTokenizer skips empty tokens
  • Forgetting it's a legacy class
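  • Calling nextToken() without checking hasMoreTokens() first (throws NoSuchElementException)
  • Using it for complex parsing (use regex instead)
  • Treating countTokens() as a fixed total (it does not consume tokens, but it returns only the tokens remaining, so the value shrinks after each nextToken() call)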
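
One way the countTokens() item tends to show up in practice is a for loop that uses it as a live bound. The sketch below is illustrative only; `CountTokensLoopPitfall` and its methods are hypothetical names. Because countTokens() reports the tokens remaining, the bound drops while the index rises, and the loop stops about halfway.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of any library.
public class CountTokensLoopPitfall {

    // Buggy: the bound is re-evaluated every pass while nextToken() shrinks it.
    static List<String> readWithLiveBound(String text) {
        StringTokenizer st = new StringTokenizer(text);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < st.countTokens(); i++) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Correct: loop on hasMoreTokens() instead of a moving count.
    static List<String> readAll(String text) {
        StringTokenizer st = new StringTokenizer(text);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(readWithLiveBound("a b c d")); // [a, b]
        System.out.println(readAll("a b c d"));           // [a, b, c, d]
    }
}
```

If a count-based loop is required, storing countTokens() in a local variable before the loop starts gives a stable bound.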