
Bhashini - IndicNER

IndicNER is a multilingual Named Entity Recognition model fine-tuned on 11 Indian languages to identify named entities in text

  • Digital India BHASHINI Division
  • BHASHINI_shailendra

About Model

IndicNER is a multilingual Named Entity Recognition (NER) model developed by Bhashini. It recognizes and classifies named entities, such as persons, organizations, locations, and dates, in text written in any of 11 Indian languages:
Hindi, Bengali, Tamil, Telugu, Gujarati, Punjabi, Marathi, Assamese, Kannada, Malayalam, and Oriya.
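
To make "recognize and classify named entities" concrete, here is a minimal sketch of how token-level tags are commonly merged into entity spans. It assumes the model emits BIO-style tags (for example B-PER, I-PER, O), which is a common NER convention rather than something this page states; the function name bio_to_spans and the sample sentence are hypothetical.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level BIO tags into (entity_text, entity_type) pairs.

    Assumes BIO-style tags such as "B-PER", "I-PER", "O" (an assumption,
    not something confirmed by the model page).
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; flush any entity still in progress.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity currently being built.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag ends the current entity;
            # the stray token itself is not kept.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hand-written illustrative input, not real model output.
tokens = ["Sachin", "Tendulkar", "was", "born", "in", "Mumbai"]
tags = ["B-PER", "I-PER", "O", "O", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
```

The same merging logic applies to any of the supported languages, since it operates on tags rather than on the text itself.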

Training Dataset:
The model is fine-tuned on a large corpus derived from publicly available Indian NER datasets and on data sourced from the Samanantar Corpus, India's largest parallel corpus. Human-annotated test sets are used to check accuracy across the supported languages. The base model used for fine-tuning is BERT-base-multilingual-uncased, a multilingual BERT checkpoint whose shared vocabulary covers the supported languages.

Use Cases:
IndicNER can be used for a wide range of Natural Language Processing (NLP) applications, including:

1. Automated document processing – Extracting key entities from government, legal, and business documents.
2. Chatbots and virtual assistants – Enhancing conversational AI by identifying user queries related to people, places, and organizations.
3. News and content analysis – Automatically tagging and categorizing entities in multilingual news articles.
4. Healthcare and medical records – Identifying patient details and medical terms for structured data extraction.
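
As a rough illustration of the tagging and extraction use cases above, the sketch below groups predicted entities by type and drops low-confidence ones. It assumes predictions arrive as dictionaries with entity_group, word, and score keys (the shape produced by the Hugging Face token-classification pipeline when aggregation is enabled); the function name group_entities, the 0.5 threshold, and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_entities(predictions: List[dict], min_score: float = 0.5) -> Dict[str, List[str]]:
    """Group entity strings by type, skipping low-confidence predictions.

    Each prediction is assumed to carry "entity_group", "word" and "score"
    keys; the default threshold is an arbitrary illustrative value.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for pred in predictions:
        if pred["score"] < min_score:
            continue  # drop predictions below the confidence threshold
        label, text = pred["entity_group"], pred["word"]
        if text not in grouped[label]:
            grouped[label].append(text)  # keep first occurrence only
    return dict(grouped)


# Hand-written illustrative records, not real model output.
sample = [
    {"entity_group": "PER", "word": "Lata Mangeshkar", "score": 0.98},
    {"entity_group": "LOC", "word": "Indore", "score": 0.95},
    {"entity_group": "ORG", "word": "Doordarshan", "score": 0.41},
    {"entity_group": "LOC", "word": "Indore", "score": 0.91},
]
print(group_entities(sample))
```

A structure like this can feed article tags, search indexes, or form fields, depending on the application.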

For more details and implementation, visit: https://huggingface.co/ai4bharat/IndicNER.
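
For readers who want to try the model from the Hugging Face repository linked above, the following is a minimal sketch using the transformers library. It assumes that transformers and PyTorch are installed, that the repository can be downloaded from your environment (some repositories require logging in or accepting terms first), and that its configuration carries BIO-style label names so that aggregation_strategy="simple" can merge subword pieces. The helper names build_ner_pipeline and extract_entities are hypothetical.

```python
MODEL_ID = "ai4bharat/IndicNER"  # repository named in the link above


def build_ner_pipeline(model_id: str = MODEL_ID):
    """Load the tokenizer and model and wrap them in a token-classification pipeline.

    Calling this downloads the model weights, so it needs network access.
    """
    # Imported lazily so this file can be read or imported without transformers.
    from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    # aggregation_strategy="simple" merges subword pieces into whole entities.
    return pipeline(
        "token-classification",
        model=model,
        tokenizer=tokenizer,
        aggregation_strategy="simple",
    )


def extract_entities(ner, text: str):
    """Run a pipeline-like callable on text and return (type, text) pairs."""
    return [(entity["entity_group"], entity["word"]) for entity in ner(text)]
```

A typical call would be extract_entities(build_ner_pipeline(), sentence), where sentence is text in one of the supported languages. The exact label names returned depend on the model's configuration, so inspect the output before relying on specific entity types.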




Metadata

  • MIT
  • AI4Bharat
  • Named Entity Recognition (NER) Model
  • Open
  • Digital India BHASHINI Division
  • Sector Agnostic
  • 05/03/25 15:23:12
  • Admin
  • 591.28 MB

Activity Overview

  • Downloads 9
  • Views 302
  • File Size 591.12 MB

Tags

  • Multilingual
  • Foreigners
  • NLP
  • Transformer
  • Token Classification
  • Pytorch
  • Samanantar
  • Bert
  • NER

License Control

MIT

Version Control

Version 2 (591.12 MB)
  • admin · 1 month(s) ago
    • IndicNER.zip