Find the k most frequent words from data set in Python

Last Updated : 07 Jan, 2025

The goal is to find the k most common words in a given dataset of text. We'll look at different ways to identify and return the top k words based on their frequency, using Python.

Using collections.Counter

collections.Counter is a dictionary subclass whose job is to count how many times each item appears in a list or other iterable. It makes it simple to count items, look up how often each one appears, and find the most common ones with its most_common() method.

Example:

```python
from collections import Counter

# A list of items to count
li = ['apple', 'banana', 'orange', 'apple', 'banana', 'apple']

# Create a Counter object
cnt = Counter(li)

# Display the counts (a Counter prints like a dictionary)
print("Item counts:", cnt)

# Access count of a specific item
print("Count of apples:", cnt['apple'])
```

Output

```
Item counts: Counter({'apple': 3, 'banana': 2, 'orange': 1})
Count of apples: 3
```

Using heapq.nlargest

The heapq.nlargest function finds the top k most frequent words by keeping track of just the k most common ones as it goes through the data.
When k is small compared with the number of distinct words, this is faster and uses less memory than sorting all the words.

Example:

```python
import heapq
from collections import Counter

# A list of words to analyze
li = ['apple', 'banana', 'orange', 'apple', 'banana', 'apple',
      'orange', 'banana', 'grape', 'grape', 'grape', 'grape']

# Step 1: Count the frequency of each word
cnt = Counter(li)

# Step 2: Find the top k most frequent words using heapq.nlargest
k = 2  # Top 2 most frequent words
top_k = heapq.nlargest(k, cnt.items(), key=lambda x: x[1])

# Display the results
print(cnt)
print(f"{k} most frequent words:", top_k)
```

Output

```
Counter({'grape': 4, 'apple': 3, 'banana': 3, 'orange': 2})
2 most frequent words: [('grape', 4), ('apple', 3)]
```

Using pandas.value_counts

The value_counts() method of a pandas Series is a simple way to find the most frequent words, especially when the data is already in a pandas Series or a DataFrame column. It counts how often each word appears and returns the counts sorted from most to least frequent, so the top k words are the first k rows.

Example:

```python
import pandas as pd

# A list of words to analyze
li = ['apple', 'banana', 'orange', 'apple', 'banana', 'apple',
      'orange', 'banana', 'grape', 'grape', 'grape', 'grape']

# Step 1: Convert the list to a pandas Series
s = pd.Series(li)

# Step 2: Use value_counts to count the occurrences
cnt = s.value_counts()

# Step 3: Extract the top k most frequent words
k = 2  # Top 2 most frequent words
top_k = cnt.head(k)

# Display the results
print(cnt)
print(f"\n{k} most frequent words:\n", top_k)
```

Output

```
grape     4
apple     3
banana    3
orange    2
dtype: int64

2 most frequent words:
 grape    4
apple    3
dtype: int64
```

Note: Here dtype is the data type of the word counts. It is int64 because pandas stores the counts as 64-bit integers by default. In pandas 2.0 and later, the last line of each printed Series reads `Name: count, dtype: int64`.
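The examples above start from a ready-made list of words. As a minimal sketch of the full task, assuming the dataset arrives as one raw string, the text can be split into lowercase words with a regular expression and passed to Counter, whose most_common(k) method returns the k highest counts directly. The function name top_k_words and the tokenization rule are illustrative assumptions, not part of the original examples.

```python
import re
from collections import Counter


def top_k_words(text, k):
    """Return the k most frequent words in text as (word, count) pairs.

    Hypothetical helper: lowercasing and the regex below are one simple
    tokenization choice, not the only reasonable one.
    """
    # Treat runs of letters and apostrophes as words, ignoring case
    words = re.findall(r"[a-z']+", text.lower())
    # most_common(k) returns the k highest counts, largest first
    return Counter(words).most_common(k)


sample = "Apple banana apple. Orange, banana; apple!"
print(top_k_words(sample, 2))  # [('apple', 3), ('banana', 2)]
```

On Python 3.7 and later, words with equal counts come back in the order they were first encountered, which matches the tie order seen in the heapq.nlargest example.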
Author: abhishek1