Day 5: SQL Text Manipulation: String Functions, Substring Extraction, and Pattern Matching

Ashimabha Bose
Oct 18, 2023


Introduction

SQL string functions are used to manipulate text data in a variety of ways. They can be used to extract substrings, match patterns, and perform other types of text manipulation.

In this blog post, we will share some of the most common SQL string functions, as well as examples of how to use them to extract substrings, match patterns, and work with wildcards.

Common SQL String Functions

The following are some of the most common SQL string functions:

  • SUBSTRING: The SUBSTRING function extracts a substring from a string. The basic syntax of the SUBSTRING function is as follows:
SUBSTRING(string, start_position, length);

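The string argument is the string from which you want to extract the substring. The start_position argument is the position of the first character in the substring; positions are counted from 1, so 1 refers to the first character of the string. The length argument is the number of characters to return. Some databases, such as Oracle and SQLite, name this function SUBSTR.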

For example, the following query will extract the first three characters from the string “Hello, world!”:

SELECT SUBSTRING('Hello, world!', 1, 3);

This query will return the string “Hel”.
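To try the SUBSTRING example without setting up a database server, here is a minimal sketch that runs it through Python's built-in sqlite3 module. It assumes SQLite as the engine, which is not something this post specifies, and SQLite spells the function SUBSTR. The variable names are only illustrative.

```python
import sqlite3

# In-memory SQLite database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# SQLite names the function SUBSTR; positions are 1-based, as in the text.
first_three = conn.execute(
    "SELECT SUBSTR('Hello, world!', 1, 3)"
).fetchone()[0]

# Omitting the length returns everything from the start position onward.
from_eighth = conn.execute(
    "SELECT SUBSTR('Hello, world!', 8)"
).fetchone()[0]

print(first_three)  # Hel
print(from_eighth)  # world!
conn.close()
```

The second query shows a common variation: leaving out the length argument returns the rest of the string.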

  • INSTR: The INSTR function returns the position of a substring within a string. The basic syntax of the INSTR function is as follows:
INSTR(string, substring, start_position);

The string argument is the string in which you want to search for the substring. The substring argument is the substring that you are searching for. The start_position argument is the position in the string at which to start the search. It is optional, and not every database supports it: MySQL's INSTR, for example, accepts only the first two arguments.

For example, the following query will return the position of the substring “world” within the string “Hello, world!”:

SELECT INSTR('Hello, world!', 'world');

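This query will return the number 8, which is the position of the first character of the substring “world” (the comma and the space before it occupy positions 6 and 7). If the substring is not found, INSTR returns 0.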
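The position returned by INSTR is easy to check by running the query. The sketch below again assumes SQLite through Python's sqlite3 module, where INSTR takes exactly two arguments. Counting the comma and the space, the “w” of “world” is the eighth character of the string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# INSTR(string, substring) returns the 1-based position of the first match.
found = conn.execute(
    "SELECT INSTR('Hello, world!', 'world')"
).fetchone()[0]

# When the substring does not occur, the result is 0 rather than an error.
missing = conn.execute(
    "SELECT INSTR('Hello, world!', 'xyz')"
).fetchone()[0]

print(found)    # 8
print(missing)  # 0
conn.close()
```

The zero result for a missing substring matters whenever INSTR feeds another function, because a zero position usually needs its own handling.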

  • TRIM: The TRIM function removes whitespace from the beginning and end of a string. The basic syntax of the TRIM function is as follows:
TRIM(string);

The string argument is the string from which you want to remove whitespace.

For example, the following query will remove whitespace from the beginning and end of the string “ Hello, world! ”:

SELECT TRIM(' Hello, world! ');

This query will return the string “Hello, world!”.
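A short sketch of TRIM in action, assuming SQLite through Python's sqlite3 module. In SQLite, the one-argument TRIM removes space characters only, and LTRIM and RTRIM are the one-sided variants. The sample string is passed as a bound parameter so the padding is easy to see.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

padded = "  Hello, world!  "  # two spaces on each side

# TRIM strips both ends; LTRIM and RTRIM strip only the left or right end.
trimmed, left, right = conn.execute(
    "SELECT TRIM(?), LTRIM(?), RTRIM(?)",
    (padded, padded, padded),
).fetchone()

print(repr(trimmed))  # 'Hello, world!'
print(repr(left))     # 'Hello, world!  '
print(repr(right))    # '  Hello, world!'
conn.close()
```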

Substring Extraction

Substring extraction is the process of extracting a substring from a string. Substring extraction can be used to extract specific parts of a string, such as the first name or last name from a customer’s name.

To extract a substring from a string, you can use the SUBSTRING function. The SUBSTRING function takes three arguments: the string from which you want to extract the substring, the start position of the substring, and the length of the substring.

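For example, the following query will extract the first name from a customer’s name, assuming a customers table with a name column in which the first and last names are separated by a space:

SELECT SUBSTRING(name, 1, INSTR(name, ' ') - 1) AS first_name FROM customers;

This query extracts the substring of the name column that starts at position 1 and stops just before the first space. Subtracting 1 from the INSTR result keeps the space itself out of the output. If a name contains no space, INSTR returns 0, so such rows need separate handling.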
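The first-name extraction can be sketched end to end as follows, assuming SQLite through Python's sqlite3 module. The table layout and the three sample names are made up for illustration. The query subtracts 1 from the position of the space so that the space is not included in the result, and a CASE expression (an addition not shown in the post) falls back to the full name when there is no space at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data, for illustration only.
conn.execute("CREATE TABLE customers (name TEXT)")
conn.executemany(
    "INSERT INTO customers (name) VALUES (?)",
    [("Ada Lovelace",), ("Grace Brewster Hopper",), ("Cher",)],
)

rows = conn.execute(
    """
    SELECT name,
           CASE
               -- A space exists: take everything before it.
               WHEN INSTR(name, ' ') > 0
                   THEN SUBSTR(name, 1, INSTR(name, ' ') - 1)
               -- No space: treat the whole value as the first name.
               ELSE name
           END AS first_name
    FROM customers
    ORDER BY rowid
    """
).fetchall()

first_names = [row[1] for row in rows]
print(first_names)  # ['Ada', 'Grace', 'Cher']
conn.close()
```

Only the first space is considered, so a name with a middle name still yields just the first word.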

Pattern Matching

Pattern matching is the process of testing whether a string fits a given pattern. It can be used to find rows that contain specific kinds of data, such as email addresses from a particular domain or phone numbers with a particular prefix.

To match a pattern, you can use the LIKE operator. The LIKE operator takes two operands: the string that you want to test (usually a column) and the pattern to test it against. The condition is true only when the whole string matches the pattern.

For example, the following query will find all customers whose email address is at the example.com domain:

SELECT * FROM customers WHERE email LIKE '%@example.com';

This query will return all rows in the customers table where the email column ends with @example.com. To match @example.com anywhere in the value, add a second percent sign at the end of the pattern: '%@example.com%'.
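The difference between a pattern with a leading percent sign only and a pattern with percent signs on both sides is easiest to see on data. The sketch below assumes SQLite through Python's sqlite3 module, and the four email addresses are invented for illustration. Note that SQLite's LIKE is case-insensitive for ASCII letters by default, which may differ from other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data, for illustration only.
conn.execute("CREATE TABLE customers (email TEXT)")
conn.executemany(
    "INSERT INTO customers (email) VALUES (?)",
    [
        ("alice@example.com",),
        ("bob@example.org",),
        ("carol@example.com.au",),
        ("dave@mail.example.com",),
    ],
)

def emails_like(pattern):
    """Return the emails matching a LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "SELECT email FROM customers WHERE email LIKE ? ORDER BY email",
        (pattern,),
    ).fetchall()
    return [row[0] for row in rows]

# Leading % only: the value must END with @example.com.
ends_with = emails_like("%@example.com")

# % on both sides: @example.com may appear ANYWHERE in the value.
contains = emails_like("%@example.com%")

print(ends_with)  # ['alice@example.com']
print(contains)   # ['alice@example.com', 'carol@example.com.au']
conn.close()
```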

Working with Wildcards

Wildcards are special characters used in LIKE patterns to stand in for other characters. The two most common wildcards in SQL are the underscore (_) and the percent sign (%).

The underscore (_) wildcard represents exactly one character. The percent sign (%) wildcard represents any sequence of zero or more characters.
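As a quick illustration of how the two wildcards differ, here is a minimal sketch assuming SQLite through Python's sqlite3 module. The words table and its contents are hypothetical and chosen only to make the difference visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data, for illustration only.
conn.execute("CREATE TABLE words (word TEXT)")
conn.executemany(
    "INSERT INTO words (word) VALUES (?)",
    [("cat",), ("cot",), ("coat",), ("ct",), ("cart",)],
)

def words_like(pattern):
    """Return the words matching a LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "SELECT word FROM words WHERE word LIKE ? ORDER BY word",
        (pattern,),
    ).fetchall()
    return [row[0] for row in rows]

# _ stands for exactly one character between c and t.
one_char = words_like("c_t")

# % stands for zero or more characters between c and t.
any_chars = words_like("c%t")

print(one_char)   # ['cat', 'cot']
print(any_chars)  # ['cart', 'cat', 'coat', 'cot', 'ct']
conn.close()
```

The word “ct” matches only the percent-sign pattern, because the percent sign also matches an empty sequence while the underscore always requires one character.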

For example, the following query will find all customers with a last name that starts with the letter
