About This Project

This project explores how large language models (LLMs) portray diverse Indian cultural identities in the stories they generate. It examines the prevalence and nature of cultural misrepresentations across languages and regions, showing how aspects such as social practices, norms, and even food preferences are often depicted inaccurately. Explore the annotated stories below to see how these misrepresentations surface in LLM-generated stories.

View Paper

View Annotations

Resources & Links

Annotation Interface

Interface used in the paper for collecting human annotations

View Interface Code

Analysis Code

Code for data analysis and generating insights from annotations

View Analysis Code

Question Bank

Cultural knowledge questions for testing AI systems; a usage sketch follows below

View Dataset
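The dataset's exact schema isn't documented on this page, so the snippet below is only an illustrative sketch of how such a question bank might be used to probe a model. It assumes, hypothetically, a JSON file of records with `question` and `region` fields; `query_llm` is a placeholder for whatever model client you use, not an API from this repository.

```python
import json


def query_llm(prompt: str) -> str:
    """Placeholder: replace with a call to your LLM client of choice."""
    raise NotImplementedError("Wire this up to your model API.")


def probe_cultural_knowledge(path: str) -> list[dict]:
    """Ask an LLM each question in the bank and collect its answers.

    Assumes a hypothetical JSON list of records shaped like:
        {"question": "...", "region": "..."}
    """
    with open(path, encoding="utf-8") as f:
        questions = json.load(f)

    results = []
    for item in questions:
        answer = query_llm(item["question"])
        results.append({
            "region": item.get("region"),
            "question": item["question"],
            "model_answer": answer,
        })
    return results
```

Collected answers could then be compared against reference answers or routed through the same human annotation pipeline used for the stories.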

Contact

For inquiries or feedback about this research:

Contact us via email