To find and remove duplicate lines in a large text file, we can follow these steps:
- Open the text file in read mode:
with open("content.txt", "r") as f:
- Read the lines of the file and store them in a list:
linelist = f.readlines()
- Create a temporary list to store the unique lines:
R = []
- Iterate through the lines in linelist and check whether each line is already present in the temporary list. If it is not present, append it to the temporary list:
for line in linelist:
    if line not in R:
        R.append(line)
- Open the text file in write mode and write the unique lines from the temporary list back to the file:
with open("content.txt", "w") as f:
for line in R:
f.write(line)
That is the whole procedure for finding and removing duplicate lines in a large text file: the program first reads the lines of the file into a list, then removes the duplicates from the list, and finally writes the unique lines back to the file.
Here is the complete Python program:
# Opening the Text File in Read Mode
with open("content.txt", "r") as f:
    linelist = f.readlines()

# Temporary List
R = []

# Iterating through the lines and checking for duplicates
for line in linelist:
    if line not in R:
        R.append(line)

# Writing Unique Lines in Text File
with open("content.txt", "w") as f:
    for line in R:
        f.write(line)
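One caveat for genuinely large files: the check line not in R scans the whole list on every iteration, so the loop above takes quadratic time in the number of lines. A minimal sketch of the same program, assuming the file still fits in memory, keeps a set alongside the list so each membership check runs in constant time (the names seen and unique_lines are illustrative, not from the original code):

# Reading all lines, as before
with open("content.txt", "r") as f:
    linelist = f.readlines()

# A set gives constant-time membership checks; the list preserves order
seen = set()
unique_lines = []
for line in linelist:
    if line not in seen:
        seen.add(line)
        unique_lines.append(line)

# Writing the unique lines back to the file
with open("content.txt", "w") as f:
    f.writelines(unique_lines)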
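If the file is too large to read into memory at once, one possible alternative (not from the original article) streams the file line by line and writes the unique lines to a temporary file that then replaces the original. Here dedupe_file is a hypothetical helper name, and note that the set still has to hold one copy of each unique line:

import os
import tempfile

def dedupe_file(path):
    # Hypothetical helper: removes duplicate lines from the file at path
    # without loading the whole file into a list first.
    seen = set()
    # Creating the temp file in the same directory keeps os.replace() an
    # atomic rename on the same filesystem.
    dir_name = os.path.dirname(os.path.abspath(path))
    with open(path, "r") as src, tempfile.NamedTemporaryFile(
        "w", dir=dir_name, delete=False
    ) as tmp:
        for line in src:
            if line not in seen:  # the set still holds each unique line
                seen.add(line)
                tmp.write(line)
        tmp_name = tmp.name
    os.replace(tmp_name, path)  # swap the deduplicated file into place

dedupe_file("content.txt")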