PostgreSQL offers a powerful substring function for manipulating strings. You can extract a section of a string based on predefined criteria by using the substring function in PostgreSQL. It offers a variety of features for a range of tasks.
Table of Contents
SUBSTRING() Function Syntax
There are two syntaxes for the substring function in PostgreSQL.
- SUBSTRING(<string_text>, <starting_position>, <length>)
- SUBSTRING(<string_text> FROM <starting_position> FOR <length>)
Both the syntaxes use the same parameters and provide the same functionality.
- <string_text> is the source string from which the substring will be extracted.
- <starting_position> is the position in the source string when extraction needs to begin.
- <length> is an optional parameter. It indicates how many characters to extract. When omitted, the string is returned until the end.
starting_position count the first character from string at Position 1. For example, if you want to extract “SQL” from “SQL is great” then you have to specify SUBSTRING(‘SQL is great’, 1, 3).
SUBSTRING Function examples
Extract First N letters as an example
If you want to extract the first N letters, you have to specify FROM as 1 and also give the FOR parameter set to N. For example, if you want to extract only the first 7 letters from the string, then you have to specify FROM as 1 and FOR as 7, as shown below.
Also read: Easy way to Install pgAdmin4 on windows
Example 2
In the second example, we extract a “great” substring from the “PostgreSQL is great” string.
SELECT SUBSTRING('PostgreSQL is great' FROM 15 FOR 5);
In the above example, I have specified position 15 and length 5 to extract the “great” word from the target string.
Example 3
In another example, if we don’t specify the last length parameter and want to extract the string until the end, starting from the position, then we have to specify only the FROM parameter, as shown below.
SELECT SUBSTRING('PostgreSQL is great' FROM 15);
Case Insensitive
Substring from the Right or End
The substring function in PostgreSQL doesn’t support negative indexing. To extract text from the right or end of the string, you have to use the RIGHT() function. The right function takes a single parameter and lets you specify the starting position from the right side.
For example, if you want to extract the “SQL” word from the end of the string, you can pass the below parameters.
SELECT RIGHT('PostgreSQL is great', 5);
SUBSTRING Last N Letters
Now, if you want to extract only the last N letters from the right side of the string, then you can use the SUBSTRING function with the RIGHT function, as shown below.
SELECT SUBSTRING(RIGHT('PostgreSQL is great', 5), 1, 3);
Substring with Regular Expressions
The substring function also supports regular expressions to extract some part of the string based on the regular expression. We can use the same FROM parameter as shown below.
Syntax
The syntax for using regular expressions is shown below.
SUBSTRING(<target_string> FROM <pattern>);
PostgreSQL supports various regular expression operators. To learn more about operators, you can check out the PostgreSQL official documentation regular expression.
Regular expressions examples using SUBSTRING function
In the below example, I am extracting a ‘SQL’ word from the target string. Here “.” means any letter, and “*” means a letter may appear zero or more times in the target string.
SELECT SUBSTRING('PostgreSQL is great' FROM 'S.*L');
Result: SQL
Note: If PostgreSQL substring function did not found any data using regular expression, it returns NULL.
In another example, I have used the “?” operator to extract part of the string. Here “?” means a letter may appear zero or only once in the target string.
SELECT SUBSTRING('PostgreSQL is great' from 'gr.?.?t')
Result: great
Extract the first N digits from string
To extract the first N digits from the beginning of the string using a regular expression, you can use the below query.
SELECT SUBSTRING('123. My Name is George.' from '^\d{1,5}')
Here, I have used the \d regular expression operator and specified {1,5}, which means a number can be 1 to 5 digits long.
Extract first word using substring function
To extract the first word, use the \w operator, which matches any word character.
SELECT SUBSTRING('My roll number is 42.' from '\w{2,20}')
Extract DATE from the text
To extract dates from text, use the \d operator and specify the length as shown below.
SELECT SUBSTRING('My date of birth is 1984-04-01.' from '\d{4}-\d{2}-\d{2}')
If the date is not formatted, like a year comes in 2 digits or 4 digits or a month comes in 1 or 2 digits, then use the length operator as shown below.
SELECT SUBSTRING('My date of birth is 84-4-01.' from '\d{2,4}-\d{1,2}-\d{2}')
Substring after first occurrence of character
Sometimes we need to extract text after the first instance or occurrence of some characters. To do that in PostgreSQL, we need to use the POSITION() function, as shown below:
SELECT SUBSTRING('age -> 35' FROM POSITION('->' IN 'age -> 35')+ 2)
In a similar way, when you want to extract data after the first space using the SUBSTRING function,. Then you can use the below code:
SELECT SUBSTRING('Name George Yarbrough' FROM POSITION(' ' IN 'Name George Yarbrough')+1)
Convert extracted substring to Integer or Number in PostgreSQL
Sometimes, we need to convert extracted strings to an integer or number. PostgreSQL provides an easy way to do that.
SELECT SUBSTRING('my age is 32.' FROM '\d{1,4}')::integer
As you can see in the highlighted image area, the converted value is not coming as an integer. In the same way, you can convert strings to decimals or any other data type.
Summary
With the PostgreSQL SUBSTRING function, you can extract specified text from data. You can use the RIGHT function to extract a string from the end of the data, or you can use a regular expression to extract the desired text.