How to Get a Substring of a String in Python?


Strings are everywhere in programming. They are powerful and flexible. But what if you only need a part of a string? In this article, we’ll explore how to get a substring of a string in Python with practical examples.

Before diving into Python substrings, let’s understand strings first. Strings are sequences of characters, and each character has a position, called its index. An example could be “Hello World!”.

[Figure: Each character of the string “Hello World!” with its respective index]

Code

In the sample code, the indexes of the string “Hello World!” are as follows: H(0), e(1), l(2), l(3), o(4), space(5), W(6), o(7), r(8), l(9), d(10), !(11). We can get any character of a string by using its index.

string = "Hello World!"

print(string[0])

Output

H
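
As a small sketch of the same idea, the snippet below reads a few more positions from “Hello World!” and shows what happens when an index goes past the end. The variable names are only illustrative.

```python
string = "Hello World!"

# Positive indexes count from 0 at the left.
first = string[0]                # "H"
seventh = string[6]              # "W" (index 5 is the space)
last = string[len(string) - 1]   # "!" at index 11

print(first, seventh, last)

# A single index past the end raises IndexError.
try:
    string[12]
except IndexError:
    print("Index 12 is out of range")
```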

What is a Substring?

A substring is any part of a string. If you have the phrase “Syntax Scenarios”, the words “Syntax” and “Scenarios” are substrings. Even smaller parts, like “tax,” are substrings.

Different Methods to Get a Substring of a String in Python

Extracting a substring isolates specific parts of a string for further processing. For example, if you extract a user’s name from a string, you can dynamically personalize messages by inserting variables into a string.

Python offers different ways to extract a part of a string. Let’s discuss all of them one-by-one.

Extract a Substring Using Slicing

Slicing is like cutting a piece out of a cake. Imagine you have a whole cake, and you want to take a slice starting from a specific point and ending at another. You simply choose where to start (the beginning of your slice) and where to end (the ending of your slice).

In Python, slicing works the same way. You specify the starting and ending positions of the part of the string you want, and Python cuts out that portion for you.

[Figure: A comparison between Cake Slicing and String Slicing]

Syntax

subString = originalString[start:end]

Code

We have a string “Syntax Scenarios” and we want to slice the word “Syntax” out of the original string. For that purpose, we will specify the start position 0 and the end position 6. Although the end index is 6, it is not included in the result. This is a key point to remember.

originalString = "Syntax Scenarios" 	# Original string
subString = originalString[0:6]         # Extracting "Syntax"

print(subString)

Output

Syntax
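
To see the "end is not included" rule from another angle, here is a short sketch that slices the second word out of the same string. It also shows that a slice, unlike a single index, quietly stops at the end of the string when the end value is too large. The variable names are assumptions for illustration.

```python
original_string = "Syntax Scenarios"

# "Scenarios" occupies indexes 7 through 15, so the end value is 16.
second_word = original_string[7:16]
print(second_word)

# The length of a slice is end - start when both are within range.
print(len(second_word) == 16 - 7)

# An end value past the last index is clamped instead of raising an error.
clamped = original_string[7:100]
print(clamped)
```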

Get Left or Right Substrings

Sometimes, you only need the beginning or the end of a string, rather than a specific section. Python makes this task easy by allowing you to omit the start or end value in a slice.

Get the Left Part of a String

To get the first portion of a string, you only need to provide the end index. Python will assume the slicing starts at the beginning of the string.

Code

Below, string[:20] tells Python to start at the beginning and stop just before index 20. This extracts the first 20 characters, “Programming is great”.

string = "Programming is great with Syntax Scenarios!"  
left_part = string[:20]                 # Extract the first 20 characters 

print(left_part)

Output

Programming is great

Get the Right Part of a String

To grab the last part of a string, specify only the start index. Python will extract all characters from that position to the end of the string.

Code

In this code, Python extracts everything from index 26 to the end of the string and displays the output “Syntax Scenarios!”.

string = "Programming is great with Syntax Scenarios!"
right_part = string[26:]                # Get everything from index 26 onwards

print(right_part)

Output

Syntax Scenarios!
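
One handy consequence of omitting the start or the end: for the same index n, the left part and the right part always fit back together into the original string. A minimal sketch, where n is just an example split point:

```python
string = "Programming is great with Syntax Scenarios!"
n = 20  # example split point

left_part = string[:n]    # everything before index n
right_part = string[n:]   # everything from index n onwards

print(left_part)
print(right_part)

# The two halves rebuild the original string exactly.
print(left_part + right_part == string)
```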

Extract Substrings from the End Using Negative Indexing

Negative indexing is like reading a book from the last page to the first. Instead of starting from the front, you count backwards from the end of the book. The last page is -1, the second-to-last is -2, and so on.

In Python, negative indexing works the same way. You can start from the end of the string and count backwards to extract the part you need. This way, you don’t need to know the total length of the string to get the last characters. You can just count backwards and grab what you need!

[Figure: Reading a book from the start shows Positive Indexing and from the end shows Negative Indexing]

Code

In this example, string[-9:] tells Python to start at the 9th character from the end (which is “S” in “Scenarios”) and go all the way to the end of the string. As a result, the last 9 characters, “Scenarios”, are extracted.

string = "Welcome to Syntax Scenarios"  
substring = string[-9:]   				# Extract the last 9 characters 

print(substring)

Output

Scenarios
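
Negative values can be used for both the start and the end of a slice. The sketch below picks “Syntax” out of the same string by counting from the end, and checks that a negative index is just a shortcut for len(string) minus that number.

```python
string = "Welcome to Syntax Scenarios"

last_char = string[-1]        # last character
middle = string[-16:-10]      # 16th-from-last up to (not including) 10th-from-last

print(last_char)
print(middle)

# -9 refers to the same position as len(string) - 9.
same_slice = string[len(string) - 9:]
print(same_slice == string[-9:])
```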

Find all Substrings of a String

Sometimes you need every possible substring of a given string, not just one. You can generate them all with nested loops.

Code

There are two loops in this block of code: outer loop & inner loop. The outer loop sets the starting index of the substring and the inner loop changes the ending index, generating all possible substrings.

string = "Code"

for start in range(len(string)):         # Generate all substrings
    for end in range(start + 1, len(string) + 1):  
        print(string[start:end])

Output

C
Co
Cod
Code
o
od
ode
d
de
e
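
If you would rather keep the substrings than print them, the same two loops can be written as a list comprehension. This is one possible variation, not the only way to do it. A string of length n has n * (n + 1) / 2 non-empty substrings, which the last line checks.

```python
string = "Code"

# Same start/end ranges as the nested loops, collected into a list.
substrings = [
    string[start:end]
    for start in range(len(string))
    for end in range(start + 1, len(string) + 1)
]
print(substrings)

n = len(string)
print(len(substrings) == n * (n + 1) // 2)
```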

Using the split() Method

The split() method works by dividing a string into smaller parts based on a specific delimiter (separator) you provide. It creates a list where each part is an item, and you can select the part you need.

Code

In this example, the separator is a space, splitting the string into “Syntax” and “Scenarios.” You can then select “Scenarios” using its index which is 1.

originalString = "Syntax Scenarios"
substring = originalString.split(" ")[1]		# Split the string based on space

print(substring)

Output

Scenarios
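
A few variations on split() are sketched below, using a longer example sentence chosen for illustration. Calling split() with no argument splits on any whitespace, and the optional second argument limits how many splits are made.

```python
text = "Welcome to Syntax Scenarios"

# No argument: split on runs of whitespace.
words = text.split()
print(words)

# Negative indexes work on the resulting list too.
last_word = words[-1]
print(last_word)

# maxsplit=1: split only at the first space, leaving the rest intact.
first, rest = text.split(" ", 1)
print(first)
print(rest)
```

Keep in mind that picking an index that does not exist in the list, for example words[10] here, raises IndexError.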

Get a Substring Using Regular Expressions

For more advanced tasks, Python’s re module allows you to extract substrings that match specific patterns. This is especially useful for tasks like identifying emails, phone numbers, or other structured text.

Code

In the given code, the output example@syntaxscenarios.com is extracted from the text using a regular expression that identifies email patterns based on word boundaries, characters, and symbols.

import re

emailString = "My email is example@syntaxscenarios.com"
match = re.search(r'\b[\w\.-]+@[\w\.-]+\.\w+\b', emailString)		# Identifies text that looks like an email address

if match:
    print(match.group())

Output

example@syntaxscenarios.com
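
re.search() stops at the first match. If the text may contain several matches, re.findall() returns all of them as a list. The sketch below reuses the same pattern on a made-up sentence; the second address is hypothetical, and the pattern is a simple approximation rather than a full email validator.

```python
import re

# Same pattern as above, compiled once so it can be reused.
EMAIL_PATTERN = re.compile(r'\b[\w.-]+@[\w.-]+\.\w+\b')

text = "Contact example@syntaxscenarios.com or admin@example.org for help."

emails = EMAIL_PATTERN.findall(text)
print(emails)

# When nothing matches, search() returns None, so check before calling group().
no_match = EMAIL_PATTERN.search("There is no address here")
print(no_match is None)
```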

Using a Custom Function

A custom function allows you to define specific rules for extracting substrings. The core logic of the function lies in constructing the substring manually. It initializes an empty string result and uses a for loop to iterate over the characters in the string, starting from the start index. The loop runs until either the specified number of characters (length) has been added or the end of the string is reached. During each iteration, the character at the current index is appended to result.

For example, given the input "Syntax Scenarios", with start = 7 and length = 5, the function iterates over indices 7 through 11, appending characters to form "Scena", which is returned as the result.

Code

As you can see in the example, the function extracts the characters at indexes 7 through 11. The result is “Scena”, the first five characters of “Scenarios”.

def custom_substring(string, start, length):
    # Handle invalid inputs
    if start < 0 or start >= len(string):
        return "Invalid start index."
    if length < 0:
        return "Length must be non-negative."
    
    # Manually build the substring
    result = ""
    for i in range(start, min(start + length, len(string))):  # Iterate over the range
        result += string[i]
    
    return result

# Example usage
text = "Syntax Scenarios"
substring = custom_substring(text, 7, 5)
print(substring)  # Output: "Scena"

Output

Scena
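
For comparison, a start-plus-length extraction like this can also be written as a single slice. The sketch below assumes start is a valid non-negative index and length is non-negative, so it does not reproduce the error messages of the function above.

```python
text = "Syntax Scenarios"
start = 7
length = 5

# Slice from start up to (not including) start + length.
sliced = text[start:start + length]
print(sliced)

# If length runs past the end, the slice simply stops at the last character.
clamped = text[start:start + 50]
print(clamped)
```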

Find the Location of a Substring

Knowing where a substring appears in a string can be quite helpful, especially when you’re processing text. Python provides a built-in method, find(), that returns the location of a substring.

Using find()

The find() method is the easiest way to find the index of a substring. It returns the index of the first occurrence of the substring. If it isn’t found, it returns -1.

Code

As you can see, string.find(substring) searches for “Syntax” in string. The method returns 11, which is the index of the first letter of the word “Syntax”.

string = "Welcome to Syntax Scenarios"
substring = "Syntax"    				# Substring to find

position = string.find(substring)  		# Find the position of "Syntax"
print(position)

Output

11
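
The position returned by find() combines naturally with slicing. Here is a minimal sketch that extracts the found word and everything after it, and handles the -1 case explicitly. The search term “Python” is just an example of something that is not in the string.

```python
string = "Welcome to Syntax Scenarios"
substring = "Syntax"

position = string.find(substring)

if position != -1:
    # Slice out the match itself, then whatever follows it.
    extracted = string[position:position + len(substring)]
    after = string[position + len(substring):]
    print(extracted)
    print(after)

# A substring that is not present gives -1, which must not be used as a slice start.
missing = string.find("Python")
print(missing)
```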

Note: Truncating a string is often confused with getting a substring. Truncating shortens a string so it fits within a visual or character limit, often adding “…” to indicate it’s cut off. Substring extraction, on the other hand, isolates specific parts of a string for data manipulation or logical purposes without altering its intent or meaning.
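
To make the difference concrete, here is a rough sketch of a truncation helper. The function name truncate, its parameters, and the three-dot suffix are assumptions for illustration, and it assumes limit is at least as long as the suffix.

```python
def truncate(text, limit, suffix="..."):
    """Shorten text to at most `limit` characters, marking the cut with `suffix`."""
    if len(text) <= limit:
        return text  # already short enough, nothing to cut
    # Leave room for the suffix so the total length equals limit.
    return text[:limit - len(suffix)] + suffix

short = truncate("Welcome to Syntax Scenarios", 13)
print(short)

unchanged = truncate("Syntax", 13)
print(unchanged)
```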

Real-Life Use Cases for Extracting Substrings

1. Text Analysis

In the field of language processing, substrings help break down text into smaller parts. For example, you might use them to find specific words or count how often certain phrases appear in a text.

2. Data Parsing

When working with organized data or logs, you may need to pull out specific pieces of information. Substrings help you extract things like names, emails, or other details from a line of text.

3. URL Handling

If you’re working with websites or APIs, substrings help you grab specific parts of a URL, like the website’s name or its path. This makes it easier to handle and process URLs.

4. User Input Validation

Substrings are also used to check or format what users type. For example, you might use them to pull out a country code from a phone number or check if something starts with a certain word or letter.
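
Two of these use cases can be sketched in a few lines. The URL and phone number below are made-up examples, and the phone format (a plus sign, a country code, then a dash) is an assumption. For URLs, the standard library's urllib.parse module is usually a safer choice than slicing by hand.

```python
from urllib.parse import urlsplit

# URL handling: let the standard library split the URL into parts.
url = "https://example.com/articles/python-substrings"  # hypothetical URL
parts = urlsplit(url)
print(parts.netloc)
print(parts.path)

# User input validation: check a prefix, then slice out the country code.
phone = "+92-300-1234567"  # hypothetical number in an assumed format
country_code = None
dash = phone.find("-")
if phone.startswith("+") and dash != -1:
    country_code = phone[1:dash]
print(country_code)
```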

Conclusion

Getting the substring of a string in Python is a helpful skill for many tasks. Whether working with data or processing user input, knowing how to extract substrings makes your work easier. Python provides simple ways to do this, like slicing, negative indexing, and various built-in methods. These methods let you grab just the part of the string you need, whether it’s from the start, the end, or anywhere in between.

For more articles on Python programming with simple and beginner-friendly analogies, check out our dedicated Python tutorial series by Syntax Scenarios.
