How to use StringTokenizer to parse Strings into tokens in Java

This tutorial explains how to use the java.util.StringTokenizer class to parse a String containing delimited data tokens. It first explains what a StringTokenizer does, along with the basic concepts of delimiters and tokens. It then uses a Java code example to show how to write code that uses a StringTokenizer.

What does the java.util.StringTokenizer class do?

The StringTokenizer class breaks a given String containing data into smaller tokens. To do so it uses the concept of delimiters. Tokens and delimiters are the two key terms one needs to understand to use StringTokenizer. Take a look at the diagram below and then the two definitions which follow -

Diagram explaining delimiters and tokens for StringTokenizer in Java

What is a token? The portion of the string between two delimiters (or between a delimiter and the start or end of the string) is a token. Tokens contain the actual information which we want to extract from the input string.

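What is a delimiter? A character marking the end of a token is a delimiter. StringTokenizer accepts a string of delimiter characters, and every character in that string acts as a delimiter on its own.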
So, given a String containing data, if we want to read the tokens present in this string using the StringTokenizer class, then there needs to be a delimiter defined for separating the tokens. We can then instruct the StringTokenizer class to extract the tokens from between these delimiters.
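One detail that is easier to see in code: the second constructor argument is treated as a set of delimiter characters, not as a single multi-character separator. The following is a minimal sketch of that behaviour; the class name DelimiterSetSketch and the tokenize helper are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DelimiterSetSketch {

    // Hypothetical helper: collects every token of 'input' into a list,
    // treating each character of 'delimiters' as a separate delimiter.
    static List<String> tokenize(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Both ',' and ';' end a token here.
        System.out.println(tokenize("John,David;George", ",;"));
        // prints [John, David, George]
    }
}
```

Passing ",;" therefore splits on either a comma or a semicolon; the input does not need to contain the two characters next to each other.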

How to extract tokens from a String using StringTokenizer

The StringTokenizer class provides a set of methods to iterate through and extract the tokens read from the input String. Of these, the following 2 methods cover most practical scenarios -

hasMoreTokens(): This method returns a boolean value indicating whether any more 'unprocessed' tokens are present in the input string.
nextToken(): This method returns the next String token.

The methods hasMoreTokens() and nextToken() work in tandem to move through the tokens, much like the hasNext() and next() methods of the Iterator interface. You need to check whether any tokens remain using hasMoreTokens() before accessing the next token. Just like an Iterator, StringTokenizer maintains an internal pointer to the next token to be read. A call to nextToken() reads the token the pointer is currently pointing to and moves the pointer ahead so that it now points to the next token.
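The moving pointer can be observed with countTokens(), a further StringTokenizer method that reports how many tokens are still unread. The sketch below is only an illustration; the class name TokenPointerSketch and the remainingAfter helper are hypothetical.

```java
import java.util.StringTokenizer;

class TokenPointerSketch {

    // Hypothetical helper: reads up to 'reads' tokens, then reports
    // how many tokens are still waiting to be read.
    static int remainingAfter(String input, String delimiters, int reads) {
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        for (int i = 0; i < reads && tokenizer.hasMoreTokens(); i++) {
            tokenizer.nextToken(); // each call moves the pointer one token ahead
        }
        return tokenizer.countTokens();
    }

    public static void main(String[] args) {
        String rawData = "John,David,George";
        for (int reads = 0; reads <= 3; reads++) {
            System.out.println("after " + reads + " read(s): "
                    + remainingAfter(rawData, ",", reads) + " token(s) left");
        }
    }
}
```

With three names in the input, the count of remaining tokens drops by one after every nextToken() call until it reaches zero.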

If you try to access the next token using nextToken() without checking for hasMoreTokens(), then you run the risk of java.util.NoSuchElementException being thrown in the scenario when there are no more tokens remaining. So, as an accepted practice, a call to nextToken() is always preceded by a check for the next token's existence using hasMoreTokens().
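To make the failure mode concrete, here is a small sketch that deliberately calls nextToken() once more than the input allows. The class name UncheckedNextTokenSketch and the readPastEnd helper are hypothetical; only the StringTokenizer calls and the exception type come from the JDK.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

class UncheckedNextTokenSketch {

    // Hypothetical helper: consumes every token, then makes one extra,
    // unchecked nextToken() call. Returns true if that call threw
    // NoSuchElementException.
    static boolean readPastEnd(String input, String delimiters) {
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokenizer.nextToken();
        }
        try {
            tokenizer.nextToken(); // no hasMoreTokens() check before this call
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readPastEnd("John,David", ","));
        // prints true
    }
}
```

In real code the try/catch is unnecessary; guarding each nextToken() call with hasMoreTokens() avoids the exception altogether.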

Let us now see the StringTokenizer in action. The Java code example that follows extracts 5 names (tokens) which are delimited by commas (delimiter).

Java code example showing StringTokenizer usage

package com.javabrahman.corejava;

import java.util.StringTokenizer;

public class StringTokenizerExample {

  public static void main(String[] args) {
    // Five names (tokens) separated by commas (delimiter)
    String rawData = "John,David,George,Frank,Tom";
    StringTokenizer tokenizer = new StringTokenizer(rawData, ",");
    // Check for a remaining token before every read
    while (tokenizer.hasMoreTokens()) {
      System.out.println(tokenizer.nextToken());
    }
  }
}

OUTPUT of the above code

John
David
George
Frank
Tom

Explanation of the code

In the StringTokenizerExample class's main() method we first create a string containing the comma-delimited names which is named rawData.
Next we create a StringTokenizer instance, named tokenizer, using its 2-parameter constructor. The first parameter is the string rawData, while the second parameter is the delimiter, i.e. "," (comma).
We then create a while loop which keeps executing as long as the method hasMoreTokens() returns true, i.e. as long as there are tokens remaining to be read from tokenizer.
Inside the loop we fetch the next token value, i.e. a name, and print it. After each nextToken() call tokenizer's internal pointer moves ahead to point to the next token. This sequence of fetching a token, moving forward, and re-entering the loop continues till hasMoreTokens() returns false and the loop ends.
The output of the above program is as expected - the 5 names printed on 5 lines (Note - each name is printed on a separate line since we used the println() method to print them).

Summary

In the above tutorial, we understood the basic concepts of tokens and delimiters, which the StringTokenizer class uses to parse a given input String and retrieve the values (sub-strings) stored in it. We then saw the two main methods of StringTokenizer, followed by a Java example showing StringTokenizer and its methods in action.
