Strings are everywhere in programming. They are powerful and flexible. But what if you only need a part of a string? In this article, we’ll explore how to get a substring of a string in Python with practical examples.
Before diving into Python substrings, let’s understand strings first. Strings are sequences of characters, and each character has a position, called its index. An example could be “Hello World!”.
Code
In the sample code, the indexes of the string “Hello World!” are as follows: H(0), e(1), l(2), l(3), o(4), space(5), W(6), o(7), r(8), l(9), d(10), !(11). We can get any part of a string by using these indexes.
string = "Hello World!"
print(string[0])
Output
H
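To see the index of every character at once, you can pair each character with its position. This is a small sketch that uses the built-in enumerate(); the variable names are only illustrative.

```python
text = "Hello World!"

# Pair every character with its index: (0, 'H'), (1, 'e'), ...
indexed = list(enumerate(text))

first_char = text[0]              # character at index 0
space_char = text[5]              # the space between the two words
last_char = text[len(text) - 1]   # the last valid index is len(text) - 1

print(indexed[:3])                # [(0, 'H'), (1, 'e'), (2, 'l')]
print(first_char, last_char)      # H !
```

Asking for an index past the end, such as text[12] here, raises an IndexError.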
What is a Substring?
A substring is any part of a string. If you have the phrase “Syntax Scenarios”, the words “Syntax” and “Scenarios” are substrings. Even smaller parts, like “tax,” are substrings.
Different Methods to Get a Substring of a String in Python
Extracting a substring isolates a specific part of a string for further processing. For example, once you extract a user’s name from a longer string, you can insert it into a message to personalize it.
Python offers different ways to extract a part of a string. Let’s discuss them one by one.
Extract a Substring Using Slicing
Slicing is like cutting a piece out of a cake. Imagine you have a whole cake, and you want to take a slice starting from a specific point and ending at another. You simply choose where to start (the beginning of your slice) and where to end (the ending of your slice).
In Python, slicing works the same way. You specify the starting and ending positions of the part of the string you want, and Python cuts out that portion for you.
Syntax
subString = originalString[start:end]
Code
We have a string “Syntax Scenarios” and we want to slice the word “Syntax” out of the original string. For that purpose, we will specify the start position 0 and the end position 6. Although the end index is 6, it is not included in the result. This is a key point to remember.
originalString = "Syntax Scenarios"  # Original string
subString = originalString[0:6]  # Extracting "Syntax"
print(subString)
Output
Syntax
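Because the end index is excluded, the length of a slice is simply end minus start. The sketch below checks that, and also shows that slice bounds past the end of the string are quietly clamped rather than raising an error. The variable names are illustrative.

```python
original = "Syntax Scenarios"

start, end = 0, 6
piece = original[start:end]

# The end index is excluded, so the slice length is end - start.
assert len(piece) == end - start

# Out-of-range slice bounds are clamped instead of raising IndexError.
clamped = original[7:100]

print(piece)    # Syntax
print(clamped)  # Scenarios
```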
Get Left or Right Substrings
Sometimes, you only need the beginning or the end of a string, rather than a specific section. Python makes this task easy by allowing you to omit the start or end value in a slice.
Get the Left Part of a String
To get the first portion of a string, you only need to provide the end index. Python will assume the slicing starts at the beginning of the string.
Code
Below, string[:20] tells Python to start at the beginning and stop just before index 20. This extracts the first 20 characters, which are “Programming is great”.
string = "Programming is great with Syntax Scenarios!"
left_part = string[:20]  # Extract the first 20 characters
print(left_part)
Output
Programming is great
Get the Right Part of a String
To grab the last part of a string, specify only the start index. Python will extract all characters from that position to the end of the string.
Code
In this code, Python extracts everything from index 26 to the end of the string and displays the output “Syntax Scenarios!”.
string = "Programming is great with Syntax Scenarios!"
right_part = string[26:]  # Get everything from index 26 onwards
print(right_part)
Output
Syntax Scenarios!
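A handy property of left and right slices is that cutting at the same index gives two pieces that join back into the original string. Here is a minimal sketch; the cut position of 20 is just an example value.

```python
text = "Programming is great with Syntax Scenarios!"
cut = 20  # example cut position

left = text[:cut]   # everything before index 20
right = text[cut:]  # everything from index 20 onwards

# The two halves always reassemble into the original string.
assert left + right == text

print(repr(left))   # 'Programming is great'
print(repr(right))  # ' with Syntax Scenarios!'
```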
Extract Substrings from the End Using Negative Indexing
Negative indexing is like reading a book from the last page to the first. Instead of starting from the front, you count backwards from the end of the book. The last page is -1, the second-to-last is -2, and so on.
In Python, negative indexing works the same way. You can start from the end of the string and count backwards to extract the part you need. This way, you don’t need to know the total length of the string to get the last characters. You can just count backwards and grab what you need!
Code
In this example, string[-9:] tells Python to start at the 9th character from the end (which is “S” in “Scenarios”) and go all the way to the end of the string. As a result, the last 9 characters, “Scenarios”, are extracted.
string = "Welcome to Syntax Scenarios"
substring = string[-9:]  # Extract the last 9 characters
print(substring)
Output
Scenarios
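A negative index is shorthand for counting from len(string). The sketch below compares both spellings and shows two other common patterns: reading the last character and dropping it. The variable names are illustrative.

```python
text = "Welcome to Syntax Scenarios"

last_nine = text[-9:]
same_slice = text[len(text) - 9:]  # what -9 means under the hood

last_char = text[-1]       # the final character
without_last = text[:-1]   # everything except the final character

print(last_nine)      # Scenarios
print(last_char)      # s
print(without_last)   # Welcome to Syntax Scenario
```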
Find all Substrings of a String
Sometimes, you may need every possible substring of a given string rather than just one. To generate them all, you can use nested loops.
Code
There are two loops in this block of code: outer loop & inner loop. The outer loop sets the starting index of the substring and the inner loop changes the ending index, generating all possible substrings.
string = "Code"
# Generate all substrings
for start in range(len(string)):
    for end in range(start + 1, len(string) + 1):
        print(string[start:end])
Output
C
Co
Cod
Code
o
od
ode
d
de
e
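If you want to keep the substrings instead of printing them, the same two loops can be written as a list comprehension. This is a sketch with a hypothetical helper named all_substrings(). A string of length n has n * (n + 1) / 2 non-empty substrings, so the list grows quickly for long inputs.

```python
def all_substrings(text):
    """Return every non-empty substring of text, in start-then-end order."""
    return [
        text[start:end]
        for start in range(len(text))
        for end in range(start + 1, len(text) + 1)
    ]

subs = all_substrings("Code")
n = len("Code")

print(subs)
print(len(subs), n * (n + 1) // 2)  # 10 10
```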
Using the split() Method
The split() method works by dividing a string into smaller parts based on a specific delimiter (separator) you provide. It creates a list where each part is an item, and you can select the part you need.
Code
In this example, the separator is a space, splitting the string into “Syntax” and “Scenarios.” You can then select “Scenarios” using its index which is 1.
originalString = "Syntax Scenarios"
substring = originalString.split(" ")[1]  # Split the string based on space
print(substring)
Output
Scenarios
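Two variations of split() are worth knowing. Called with no argument, it splits on any run of whitespace, and the optional maxsplit argument limits how many cuts are made. The sketch below uses an example sentence of our own and also guards against asking for a part that does not exist.

```python
text = "Syntax Scenarios in Python"

# With no argument, split() breaks on any run of whitespace.
words = text.split()

# maxsplit=1 stops after the first separator, keeping the rest intact.
first, rest = text.split(" ", 1)

# Check the length before indexing to avoid an IndexError on short input.
second = words[1] if len(words) > 1 else ""

print(words)             # ['Syntax', 'Scenarios', 'in', 'Python']
print(first, "|", rest)  # Syntax | Scenarios in Python
print(second)            # Scenarios
```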
Get a Substring Using Regular Expressions
For more advanced tasks, Python’s re module allows you to extract substrings that match specific patterns. This is especially useful for tasks like identifying emails, phone numbers, or other structured text.
Code
In the given code, the output example@syntaxscenarios.com is extracted from the text using a regular expression that identifies email patterns based on word boundaries, characters, and symbols.
import re

emailString = "My email is example@syntaxscenarios.com"
# Identifies text that looks like an email address
match = re.search(r'\b[\w\.-]+@[\w\.-]+\.\w+\b', emailString)
if match:
    print(match.group())
Output
example@syntaxscenarios.com
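re.search() stops at the first match. When a text may contain several matching substrings, re.findall() returns all of them as a list. The sketch below reuses the same pattern on a made-up sentence; the addresses are placeholders, and the pattern is a simple approximation rather than a full email validator.

```python
import re

# Same simple pattern as above; it approximates an email, it does not validate one.
EMAIL_PATTERN = r'\b[\w\.-]+@[\w\.-]+\.\w+\b'

text = "Contact alice@example.com or bob@example.org for help"  # placeholder addresses

emails = re.findall(EMAIL_PATTERN, text)      # every match, in order
no_match = re.search(EMAIL_PATTERN, "no email here")  # None when nothing matches

print(emails)    # ['alice@example.com', 'bob@example.org']
print(no_match)  # None
```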
Using a Custom Function
A custom function allows you to define specific rules for extracting substrings. The core logic of the function lies in constructing the substring manually. It initializes an empty string result and uses a for loop to iterate over the characters in the string, starting from the start index. The loop runs until either the specified number of characters (length) has been added or the end of the string is reached. During each iteration, the character at the current index is appended to result.
For example, given the input "Syntax Scenarios", with start = 7 and length = 5, the function iterates over indices 7 through 11, appending characters to form "Scena", which is returned as the result.
Code
As you can see in the example, the function extracts 5 characters, from index 7 through index 11, and returns “Scena”, the first five letters of “Scenarios”.
def custom_substring(string, start, length):
    # Handle invalid inputs
    if start < 0 or start >= len(string):
        return "Invalid start index."
    if length < 0:
        return "Length must be non-negative."

    # Manually build the substring
    result = ""
    for i in range(start, min(start + length, len(string))):  # Iterate over the range
        result += string[i]
    return result

# Example usage
text = "Syntax Scenarios"
substring = custom_substring(text, 7, 5)
print(substring)  # Output: "Scena"
Output
Scena
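For valid inputs, the manual loop produces the same result as a slice from start to start + length. The sketch below is a hypothetical slice-based alternative, not a replacement for the function above: it is named substring_by_length(), and it raises ValueError for negative arguments instead of returning a message, which is a design choice of this sketch.

```python
def substring_by_length(text, start, length):
    """Hypothetical helper: return up to `length` characters starting at `start`."""
    if start < 0 or length < 0:
        raise ValueError("start and length must be non-negative")
    # Slicing clamps at the end of the string, so no manual bounds check is needed.
    return text[start:start + length]

sample = "Syntax Scenarios"

print(substring_by_length(sample, 7, 5))    # Scena
print(substring_by_length(sample, 7, 100))  # Scenarios
```

Raising an exception keeps error cases separate from real results, since a returned message such as "Invalid start index." is itself a string and could be mistaken for a substring.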
Find the Location of a Substring
Knowing where a substring appears in a string can be quite helpful, especially when you’re processing text. Python provides a built-in string method, find(), that helps you find the location of substrings.
Using find()
The find() method is the easiest way to find the index of a substring. It returns the index of the first occurrence of the substring. If it isn’t found, it returns -1.
Code
As you can see, string.find("Syntax") searches for the substring “Syntax” in string. The method returns 11, which is the index of the first letter of the word “Syntax”.
string = "Welcome to Syntax Scenarios"
substring = "Syntax"  # Substring to find
position = string.find(substring)  # Find the position of "Syntax"
print(position)
Output
11
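The position returned by find() combines naturally with slicing: once you know where a substring starts, you can cut it out or continue searching after it. Here is a short sketch with illustrative variable names. It also checks for -1 before slicing, because -1 is a valid index and would otherwise silently point at the last character.

```python
text = "Welcome to Syntax Scenarios"
word = "Syntax"

pos = text.find(word)
extracted = ""
if pos != -1:  # -1 means "not found", so check before slicing
    extracted = text[pos:pos + len(word)]

missing = text.find("Java")         # -1, because "Java" does not occur
next_s = text.find("S", pos + 1)    # search again, starting after index 11

print(pos, extracted)  # 11 Syntax
print(missing)         # -1
print(next_s)          # 18
```

The related index() method behaves the same way but raises a ValueError instead of returning -1.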
Note: Truncating a string is often confused with getting a substring. Truncating shortens a string so that it fits within a visual or character limit, often adding “…” to indicate that it has been cut off. Substring extraction, on the other hand, isolates specific parts of a string for data manipulation or logical purposes without altering their intent or meaning.
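To make the difference concrete, here is a minimal truncation sketch. The truncate() helper, its marker default, and the limit of 23 are all assumptions made for illustration. Notice that it uses a slice internally, but its goal is fitting a length limit, not isolating a meaningful part.

```python
def truncate(text, limit, marker="..."):
    """Hypothetical helper: shorten text to at most `limit` characters."""
    if limit < len(marker):
        raise ValueError("limit must be at least as long as the marker")
    if len(text) <= limit:
        return text  # already short enough, nothing to cut
    # Leave room for the marker so the result is exactly `limit` long.
    return text[:limit - len(marker)] + marker

sentence = "Programming is great with Syntax Scenarios!"
shortened = truncate(sentence, 23)

print(shortened)       # Programming is great...
print(len(shortened))  # 23
```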
Real-Life Use Cases for Extracting Substrings
1. Text Analysis
In natural language processing, substrings help break down text into smaller parts. For example, you might use them to find specific words or count how often certain phrases appear in a text.
2. Data Parsing
When working with organized data or logs, you may need to pull out specific pieces of information. Substrings help you extract things like names, emails, or other details from a line of text.
3. URL Handling
If you’re working with websites or APIs, substrings help you grab specific parts of a URL, like the website’s name or its path. This makes it easier to handle and process URLs.
4. User Input Validation
Substrings are also used to check or format what users type. For example, you might use them to pull out a country code from a phone number or check if something starts with a certain word or letter.
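Two of these use cases can be sketched in a few lines. The URL and the phone number below are made-up examples, and the phone format “+code-number” is an assumption. For URLs, the standard library’s urllib.parse.urlsplit() is usually safer than slicing by hand.

```python
from urllib.parse import urlsplit

# URL handling: let the standard library find the parts.
url = "https://example.com/tutorials/python?page=2"  # made-up URL
parts = urlsplit(url)
host = parts.netloc
path = parts.path

# Input validation: assumes the form "+<country code>-<number>".
phone = "+92-3001234567"  # made-up number
has_plus = phone.startswith("+")
country_code, _, local_number = phone.partition("-")

print(host, path)                  # example.com /tutorials/python
print(has_plus)                    # True
print(country_code, local_number)  # +92 3001234567
```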
Conclusion
Getting the substring of a string in Python is a helpful skill for many tasks. Whether working with data or processing user input, knowing how to extract substrings makes your work easier. Python provides simple ways to do this, like slicing, negative indexing, and various built-in methods. These methods let you grab just the part of the string you need, whether it’s from the start, the end, or anywhere in between.
For more articles on Python programming with simple and beginner-friendly analogies, check out our dedicated Python tutorial series by Syntax Scenarios.