
Word cloud: Noun descriptions


A word cloud is an image composed of the words in a book or other large piece of text, where the size of each word indicates how frequently it occurs in the text.
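
To make the idea of frequency concrete, here is a minimal sketch that counts how often each word occurs in a short text. The sample text and the plain whitespace split are assumptions for illustration only; word cloud libraries typically apply their own tokenisation and stop-word filtering.

```python
from collections import Counter

# Hypothetical sample text, not taken from the API.
sample_text = "river mountain river place river mountain"

# Split on whitespace and count the occurrences of each word.
word_counts = Counter(sample_text.split())

# The most frequent words would be drawn largest in a word cloud.
for word, count in word_counts.most_common():
    print(word, count)
```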

This tutorial shows you how to use the noun descriptions in the Vedic Society API to generate a word cloud.

Algorithm

All the paths in the Vedic Society API return responses with the same JSON structure.

{
  "nagari": "string",
  "word": "string",
  "description": "string",
  "category": "string"
}
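
As a rough illustration of working with this structure, the sketch below parses a response body and reads the description of each entry. It assumes that a response is a list of such objects, which is how the example code in this tutorial treats it; the sample values are placeholders, not real API data.

```python
import json

# Placeholder response body; the values are invented for illustration.
sample_body = """
[
  {"nagari": "string", "word": "string",
   "description": "first description", "category": "string"},
  {"nagari": "string", "word": "string",
   "description": "second description", "category": "string"}
]
"""

entries = json.loads(sample_body)

# Collect the description field of every entry.
descriptions = [entry["description"] for entry in entries]
print(descriptions)
```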

To create the corpus for the word cloud, you need the values of all the description fields. To get them, call the categories/{category} path once for each category and collect the descriptions from every response.

The following pseudocode shows how to generate the corpus.

category_list = [category1, category2, category3, ...]
word_list = []
for each category in category_list:
    GET /categories/{category}
    for each entry in the response:
        append entry.description to word_list
word_text = join word_list with spaces

After generating the corpus, use your favourite data-visualiser to create a word cloud. The following example code uses the WordCloud for Python package.

Example code in Python

In addition to WordCloud, the code uses the requests and matplotlib packages.

  1. Create a list of categories available in the API.

    category_list = [..., "mountain", "place", "river", ...]
    
  2. Make a GET call for the first item in category_list. For example, the following code shows a call for the clothing category.

    import requests
    
    headers = {
        'accept': 'application/json',
    }
    
    url = "https://api-vs.herokuapp.com/vs/v2/categories/clothing"
    
    # fetch the category and parse the JSON body
    response = requests.get(url, headers=headers)
    response_json = response.json()
    
  3. Loop through the returned JSON, pick the description of each entry, and add it to a list.

    word_list = []
    
    for entry in response_json:
        word_list.append(entry['description'])
    
  4. Make a GET call for the next category on the list, pick the descriptions, and append them to word_list. Repeat until you have covered every category.

  5. Convert the list to a single block of text, where each list item is separated by a single space.

    text = " ".join(word_list)
    
  6. Use pip to install the wordcloud package in your environment, and then use the following code to create the word cloud.

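    import matplotlib.pyplot as plt
    from wordcloud import WordCloud
    
    # generate a word cloud image with the default settings
    wordcloud = WordCloud().generate(text)
    
    # alternatively, regenerate it with a lower max_font_size
    wordcloud = WordCloud(max_font_size=40).generate(text)
    
    # display the image
    plt.figure()
    plt.imshow(wordcloud, interpolation="bilinear")
    plt.axis("off")
    plt.show()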
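
Steps 2 to 4 can be combined into a single loop. The following is a minimal sketch rather than a definitive implementation: the function name build_corpus and the get parameter are hypothetical, and the base URL is taken from the clothing example above on the assumption that every category follows the same pattern. Passing the HTTP function in as an argument keeps the sketch usable with requests.get or with a stand-in for testing.

```python
BASE_URL = "https://api-vs.herokuapp.com/vs/v2/categories/"
HEADERS = {"accept": "application/json"}


def build_corpus(category_list, get):
    """Fetch every category and join all descriptions into one text.

    `get` is any callable with the same interface as requests.get.
    """
    word_list = []
    for category in category_list:
        response = get(BASE_URL + category, headers=HEADERS)
        response.raise_for_status()  # stop on HTTP errors
        for entry in response.json():
            word_list.append(entry["description"])
    # Separate the descriptions with a single space, as in step 5.
    return " ".join(word_list)
```

With the requests package installed, a call might look like build_corpus(["clothing", "river"], requests.get), and the returned text can be passed to WordCloud().generate().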
    

Results

You should see a word cloud like this:

[Image: word cloud generated from the noun descriptions]

What to do next

You can also generate a separate word cloud for each category.
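
One possible way to prepare the text for per-category word clouds is to group the descriptions by the category field of each entry. The sketch below assumes the entries from all responses have already been collected into one list; the function name text_by_category and the sample entries are hypothetical.

```python
from collections import defaultdict


def text_by_category(entries):
    """Return a dict mapping each category to its joined descriptions."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["category"]].append(entry["description"])
    return {category: " ".join(items) for category, items in grouped.items()}


# Invented sample entries in the API's response structure.
sample_entries = [
    {"nagari": "", "word": "", "description": "flows east", "category": "river"},
    {"nagari": "", "word": "", "description": "snow peak", "category": "mountain"},
    {"nagari": "", "word": "", "description": "wide stream", "category": "river"},
]

texts = text_by_category(sample_entries)
print(texts["river"])
```

Each value in the resulting dictionary can then be passed to WordCloud().generate() in the same way as the combined text.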

More HowTo-s

See Index.