Indian Language Dialect Data Annotation Challenge - 2026

Official Guidelines

Organized by NEXTGENVECTORA DATA INNOVATIONS (OPC) Private Limited

1. Purpose of the Competition

This competition aims to create high-quality annotated datasets for Indian languages and their dialects to support research and development in NLP, Generative AI, and Responsible AI systems.

2. Eligibility

The competition is open to:

3. Important Dates

4. Preferred Annotation Languages

5. Annotation Tasks

6. Annotation Guidelines

7. Prohibited Content

8. Data Usage & Ownership

All submitted annotations will become the intellectual property of NEXTGENVECTORA DATA INNOVATIONS (OPC) Private Limited and will be used strictly for research, educational, and AI model development purposes.

9. Evaluation Criteria

10. Benefits of Participation

11. Certification & Recognition

12. Ethical Commitment

This initiative promotes linguistic diversity, responsible AI development, and ethical data collection practices.

13. Disclaimer

NEXTGENVECTORA DATA INNOVATIONS (OPC) Private Limited reserves the right to modify these guidelines at any time. Any violation of these guidelines may result in suspension or termination of a participant's access.

14. Sponsorships & Collaborations

NEXTGENVECTORA DATA INNOVATIONS (OPC) Private Limited welcomes collaborations with, and sponsorships from, colleges, universities, research institutions, and industries interested in advancing Indian language technologies, dialect preservation, and responsible AI development.

Highlights for Sponsors: