Remove Non-alphabetic Characters From String Array Example
1. Introduction
Removing non-alphabetic characters from a string is useful in applications that perform text search, matching, and analysis. In this example, I will show four ways to remove non-alphabetic characters from a string:
- via the String.replaceAll method with regular expressions.
- via character filtering with the java.util.stream API.
- via StringBuilder from the java.lang package to append the alphabetic characters.
- via RegExUtils.replaceAll from Apache Commons Lang.
2. Setup
In this step, I will create a Maven project with both the Apache Commons Lang and JUnit 5 libraries.
pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.zheng</groupId>
  <artifactId>t</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <dependencies>
    <!-- https://mvnrepository.com/artifact/org.apache.commons/commons-lang3 -->
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-lang3</artifactId>
      <version>3.17.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.junit.jupiter/junit-jupiter-api -->
    <dependency>
      <groupId>org.junit.jupiter</groupId>
      <artifactId>junit-jupiter-api</artifactId>
      <version>5.11.4</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
3. Remove All Non-alphabetic Characters From a String
In this step, I will create a RemoveNonAlphabeticUtil.java class that includes four methods to remove non-alphabetic characters.
- viaCharacter utilizes the StringBuilder to append the alphabetic characters from the characters in toCharArray.
- viaRegExUtils_replaceAll uses the replaceAll method from org.apache.commons.lang3.RegExUtils.
- viaString_replaceAll_RegEx uses the replaceAll method from java.lang.String.
- viaStream filters out any non-alphabetic character.
RemoveNonAlphabeticUtil.java
package org.zheng.demo;

import java.util.Arrays;

import org.apache.commons.lang3.RegExUtils;

public class RemoveNonAlphabeticUtil {

    // Matches any character that is not an ASCII letter (a-z or A-Z).
    private static final String NON_ALPHA_REGEX = "[^a-zA-Z]";

    // Copies only the letters of the input into a StringBuilder.
    public String viaCharacter(final String testStr) {
        StringBuilder sb = new StringBuilder();
        for (char c : testStr.toCharArray()) {
            if (Character.isLetter(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Replaces every regex match with an empty string via Apache Commons Lang.
    public String viaRegExUtils_replaceAll(final String testMsgs) {
        return RegExUtils.replaceAll(testMsgs, NON_ALPHA_REGEX, "");
    }

    // Replaces every regex match with an empty string via java.lang.String.
    public String viaString_replaceAll_RegEx(final String testMsgs) {
        return testMsgs.replaceAll(NON_ALPHA_REGEX, "");
    }

    // Cleans each element of the array and returns the results as a new array.
    public String[] viaStream(final String[] stringArray) {
        return Arrays.stream(stringArray)
                .map(str -> str.chars().filter(Character::isLetter)
                        .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append).toString())
                .toArray(String[]::new);
    }
}
- NON_ALPHA_REGEX: the regular expression [^a-zA-Z] matches any non-alphabetic character, meaning any character that is neither a lowercase nor an uppercase ASCII letter.
- viaCharacter: appends a character only if Character.isLetter(c) is true.
- viaRegExUtils_replaceAll: replaces any non-alphabetic character with "" via the org.apache.commons.lang3.RegExUtils.replaceAll method.
- viaString_replaceAll_RegEx: replaces any non-alphabetic character with "" via the java.lang.String.replaceAll method.
- viaStream: removes any non-alphabetic character by passing Character::isLetter to the filter method of the stream returned by String.chars().
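One difference between the approaches is worth illustrating: Character.isLetter accepts letters from any Unicode script, while the [^a-zA-Z] expression keeps only ASCII letters, so the methods can disagree on accented or non-Latin input. The following is a minimal sketch of that difference using only the JDK. The class name LetterMatchingDemo, its method names, and the sample string are hypothetical and not part of the example project. It also shows \P{L} as one possible Unicode-aware regular expression.

```java
// Hypothetical demo class, not part of the example project.
class LetterMatchingDemo {

    // ASCII-only: every character outside a-z and A-Z is removed.
    static String asciiLettersOnly(String input) {
        return input.replaceAll("[^a-zA-Z]", "");
    }

    // Unicode-aware: keeps every code point for which Character.isLetter is true.
    static String unicodeLettersOnly(String input) {
        StringBuilder sb = new StringBuilder();
        input.codePoints()
             .filter(Character::isLetter)
             .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    // Regex counterpart: \P{L} matches any code point that is not a Unicode letter.
    static String unicodeLettersOnlyRegex(String input) {
        return input.replaceAll("\\P{L}", "");
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent; "na\u00efve" has a diaeresis on the i.
        String sample = "caf\u00e9-42 na\u00efve!";
        System.out.println(asciiLettersOnly(sample));        // accented letters are dropped: cafnave
        System.out.println(unicodeLettersOnly(sample));      // accented letters are kept
        System.out.println(unicodeLettersOnlyRegex(sample)); // same result as the line above
    }
}
```

Which behavior is appropriate depends on the input: the ASCII pattern suits identifiers and English-only text, while the Unicode-aware variants are usually the safer choice for user-entered names.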
4. JUnit Test
In this step, I will create a RemoveNonAlphabeticUtilTest.java to test the four methods defined in step 3.
RemoveNonAlphabeticUtilTest.java
package org.zheng.demo;

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class RemoveNonAlphabeticUtilTest {

    String[] stringArray = { "this ", "is:", "some odd!~12323", "characters!" };

    RemoveNonAlphabeticUtil testClass = new RemoveNonAlphabeticUtil();

    @Test
    void test_viaCharacter() {
        for (int idx = 0; idx < stringArray.length; idx++) {
            stringArray[idx] = testClass.viaCharacter(stringArray[idx]);
        }
        verifyData(stringArray);
    }

    @Test
    void test_viaRegExUtils_replaceAll() {
        for (int idx = 0; idx < stringArray.length; idx++) {
            stringArray[idx] = testClass.viaRegExUtils_replaceAll(stringArray[idx]);
        }
        verifyData(stringArray);
    }

    @Test
    void test_viaString_replaceAll_RegEx() {
        for (int idx = 0; idx < stringArray.length; idx++) {
            stringArray[idx] = testClass.viaString_replaceAll_RegEx(stringArray[idx]);
        }
        verifyData(stringArray);
    }

    @Test
    void test_viaStream() {
        String[] updatedStrs = testClass.viaStream(stringArray);
        verifyData(updatedStrs);
    }

    private void verifyData(final String[] stringArray) {
        assertEquals("this", stringArray[0]);
        assertEquals("is", stringArray[1]);
        assertEquals("someodd", stringArray[2]);
        assertEquals("characters", stringArray[3]);
    }
}
- stringArray: defines the test string array { "this ", "is:", "some odd!~12323", "characters!" }. Note that it contains several non-alphabetic characters: white space, colon (:), exclamation mark (!), tilde (~), and numeric digits (12323).
- verifyData, first assertion: verifies that the white space is removed.
- verifyData, second assertion: verifies that the colon is removed.
- verifyData, third assertion: verifies that the white space, "!", "~", and digits are removed.
- verifyData, fourth assertion: verifies that the "!" is removed.
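Three of the tests overwrite the array elements in a loop, while viaStream returns a new array. As a minimal sketch, any single-string cleaner can be applied to a whole array without modifying the input by mapping it over a stream. The class ArrayCleaningDemo and the helper cleanAll are hypothetical names introduced only for this illustration, and the lambda inlines the [^a-zA-Z] expression so the sketch needs nothing beyond the JDK.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

// Hypothetical demo class, not part of the example project.
class ArrayCleaningDemo {

    // Applies a single-string cleaner to every element and returns a new array;
    // the input array itself is left untouched.
    static String[] cleanAll(String[] input, UnaryOperator<String> cleaner) {
        return Arrays.stream(input).map(cleaner).toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] raw = { "this ", "is:", "some odd!~12323", "characters!" };
        String[] cleaned = cleanAll(raw, s -> s.replaceAll("[^a-zA-Z]", ""));
        System.out.println(Arrays.toString(cleaned)); // [this, is, someodd, characters]
        System.out.println(raw[2]);                   // some odd!~12323 (unchanged)
    }
}
```

Returning a new array keeps the original data available for comparison or logging, at the cost of one extra allocation.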
5. Demonstration
In this step, I will run the JUnit tests and capture the test results.
6. Conclusion
In this example, I created a simple Maven project that included a Java class with four methods to remove non-alphabetic characters from a string. The viaString_replaceAll_RegEx and viaRegExUtils_replaceAll methods both call replaceAll with a regular expression argument, from the java.lang.String and org.apache.commons.lang3.RegExUtils classes respectively.
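Because String.replaceAll compiles its regular expression on every call, code that cleans many strings may prefer to compile the expression once. The following is a minimal sketch of that variation using java.util.regex.Pattern. It is not one of the four methods in this example, and the class name PrecompiledPatternDemo and the method removeNonAlphabetic are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the example project.
class PrecompiledPatternDemo {

    // Compiled once and reused; Pattern instances are immutable and safe to share.
    private static final Pattern NON_ALPHA = Pattern.compile("[^a-zA-Z]");

    // Same result as input.replaceAll("[^a-zA-Z]", ""), without recompiling the regex.
    static String removeNonAlphabetic(String input) {
        return NON_ALPHA.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removeNonAlphabetic("some odd!~12323")); // someodd
    }
}
```

For a handful of short strings the difference is negligible, so this is mainly of interest inside loops or frequently called code paths.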
7. Download
This was an example of a Maven project that removes non-alphabetic characters from a string.
You can download the full source code of this example here: Remove Non-alphabetic Characters From String Array Example


