Day 08: str_squish()

Remove leading, trailing, and excess internal white space with stringr’s str_squish().

Published

December 8, 2022

As mentioned in the overview to the stringr package (Wickham 2022), strings might not be the most glamorous part of using R, but they sure do play a big role when it comes to data cleaning.

This will be a quick hit, because str_squish() is quite simple (but also incredibly helpful, especially with wild-caught data). It does three things (the first two of which can also be done with its close cousin, str_trim()):

  1. Removes white space at the start of the string,
  2. Removes white space at the end of the string, and
  3. Replaces each run of internal white space with a single space.
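The three steps above can be sketched by combining two other stringr functions. This is only an illustration of the behavior, not necessarily how str_squish() is implemented internally, and the variable names are made up for the example.

```r
library(stringr)

# Hypothetical input with spaces, a tab, and new lines
messy <- "  a \t b\n\nc  "

# Step 3: collapse every run of white space into a single space
collapsed <- str_replace_all(messy, "\\s+", " ")

# Steps 1 and 2: strip whatever is left at the start and end
by_hand <- str_trim(collapsed)

# Check that the manual version matches str_squish() on this input
identical(by_hand, str_squish(messy))
```

In practice there is no reason to write this by hand; the sketch is just a way to see what the single call is doing.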

Let’s look at a couple of examples.

library(stringr)
str_squish(" this   is my  string   ")
[1] "this is my string"

Above, you can see that the leading and trailing spaces are gone, and each run of extra spaces in the middle has been reduced to a single space.
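For comparison, here is a small sketch that runs str_trim() and str_squish() on the same input. The variable name x is just for illustration.

```r
library(stringr)

x <- " this   is my  string   "

# str_trim() strips only the ends; the internal runs of spaces survive
trimmed <- str_trim(x)
trimmed

# str_squish() strips the ends and also collapses the internal runs
squished <- str_squish(x)
squished
```

If the internal spacing in your strings is meaningful, str_trim() is the safer choice; if it is just noise, str_squish() handles everything in one step.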

This also works with other white space characters, such as tabs and new lines.

str_squish("\n\nthis is   another\t string\t\n")
[1] "this is another string"

In addition to removing the leading and trailing white space, each run of spaces, tabs, and new lines in the middle has been collapsed to a single space.
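Like other stringr functions, str_squish() is vectorized, so it can clean a whole character vector (or data frame column) at once. The sketch below uses a made-up vector of responses to show this, including how a missing value passes through.

```r
library(stringr)

# Hypothetical free-text responses with inconsistent spacing
responses <- c("  New   York ", "Los\tAngeles", NA, "Chicago\n")

# Each element is squished independently; NA stays NA
cleaned <- str_squish(responses)
cleaned
```

The same call works on a column, for example by assigning the result of str_squish() back to that column.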

Learn more

We’ve pretty much covered the extent of str_squish() here, but if you want to learn more about string manipulation with stringr in general, check out the Strings chapter in R for Data Science (Wickham, Grolemund, and Çetinkaya-Rundel 2022).

References

Wickham, Hadley. 2022. stringr: Simple, Consistent Wrappers for Common String Operations. https://stringr.tidyverse.org.
Wickham, Hadley, Garrett Grolemund, and Mine Çetinkaya-Rundel. 2022. R for Data Science. 2nd ed. O’Reilly. https://r4ds.hadley.nz.