Weekend mini project! Since commentary on AI is inherently interdisciplinary, we connected the observations in the Pope's encyclical with decades of scholarship in Responsible AI and Ethics research and created an interactive space with these annotations!
Work with @IJ-Reynolds, @yjernite, and @meg. Lots to unpack. We started with 105 annotations. Please submit pull requests for more that we may have missed!
Did you know your voice might be cloned without your consent from just *one sentence* of audio? That's not great. So with @frimelle, we brainstormed a new idea for developers who want to curb malicious use: the Voice Consent Gate. Details and code here: https://huggingface.co/blog/voice-consent-gate
STOP EVERYTHING NOW - we might finally have a radical architecture improvement over Transformers!!!
A lone scientist just proposed Tiny Recursive Model (TRM), and it is literally the most impressive model that I've seen this year.
- Tiny Recursive Model has 7M parameters
- On ARC-AGI, it beats flagship models like Gemini-2.5-pro
Consider how wild this is: Gemini-2.5-pro must be over 10,000x bigger and had 1,000x as many authors (Alexia is alone on the paper).
What's this sorcery? In short: it's a very tiny Transformer, but it loops over itself at two different frequencies, updating two latent variables: one for the proposed answer and one for the reasoning.
@AlexiaJM started from the paper Hierarchical Reasoning Model, published a few months ago, which had already shown breakthrough improvement on ARC-AGI for its small size (27M parameters).
Hierarchical Reasoning Model had introduced one main feature: deep supervision. In their model, one part (here one layer) would run at high frequency, and another would run at lower frequency, only every n steps.
They had used a recurrent architecture, where these layers would repeat many times, but to make it work they had to make many approximations, including not fully backpropagating the loss through all layers.
Alexia studied what was useful and what wasn't, and cleaned up the architecture as follows:
- Why use a recurrent architecture, when you can just make it a loop? She made the network recursive, looping over itself.
- Why use 2 latent variables? She provides a crystal-clear explanation: the one that changes frequently is the reasoning, the one that changes at low frequency is the proposed answer. She runs ablation studies to validate that 2 is indeed optimal.
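To make the two-frequency loop concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: there is no neural network and no training, and every name in it (`recursive_refine`, `toy_update_reasoning`, `toy_update_answer`, `n_inner`, `n_outer`) is hypothetical. It only illustrates the control flow described above, assuming the reasoning latent `z` is updated at every inner step while the proposed answer `y` is updated once per outer step.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def recursive_refine(
    x: Vector,
    update_reasoning: Callable[[Vector, Vector, Vector], Vector],
    update_answer: Callable[[Vector, Vector], Vector],
    n_inner: int = 6,
    n_outer: int = 3,
) -> Tuple[Vector, Vector, int, int]:
    """Loop one update rule over itself at two frequencies.

    z (reasoning) changes at every inner step (high frequency).
    y (proposed answer) changes once per outer step (low frequency).
    """
    y = [0.0] * len(x)  # proposed answer, starts empty
    z = [0.0] * len(x)  # reasoning latent, starts empty
    reasoning_updates = 0
    answer_updates = 0
    for _ in range(n_outer):
        for _ in range(n_inner):
            # High-frequency update: refine the reasoning given input and answer.
            z = update_reasoning(x, y, z)
            reasoning_updates += 1
        # Low-frequency update: revise the answer from the refined reasoning.
        y = update_answer(y, z)
        answer_updates += 1
    return y, z, reasoning_updates, answer_updates


def toy_update_reasoning(x: Vector, y: Vector, z: Vector) -> Vector:
    # Toy stand-in for the network (an assumption, not the paper's rule):
    # move z halfway toward the current residual x - y.
    return [zi + 0.5 * ((xi - yi) - zi) for xi, yi, zi in zip(x, y, z)]


def toy_update_answer(y: Vector, z: Vector) -> Vector:
    # Toy stand-in: apply the reasoning latent as a correction to the answer.
    return [yi + zi for yi, zi in zip(y, z)]


if __name__ == "__main__":
    y, z, n_reason, n_answer = recursive_refine(
        [1.0, -2.0], toy_update_reasoning, toy_update_answer
    )
    print("answer:", y)
    print("reasoning updates:", n_reason, "answer updates:", n_answer)
```

In the toy setup the "task" is simply to reproduce `x`, so the answer drifts toward the input as the loop repeats. In the real model both update rules would be the same small learned network.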
This new setup is a much more elegant way to process reasoning than generating huge chains of tokens as all flagship models currently do.
This might be the breakthrough we've been awaiting for so long!
AI for Scientific Discovery Won't Work Without Fixing How We Collaborate.
My co-author @cgeorgiaw and I just published a paper challenging a core assumption: that the main barriers to AI in science are technical. They're not. They're social.
Key findings:
- The "AI Scientist" myth delays progress: Waiting for AGI devalues human expertise and obscures science's real purpose: cultivating understanding, not just outputs.
- Wrong incentives: Datasets have 100x longer impact than models, yet data curation is undervalued.
- Broken collaboration: Domain scientists want understanding. ML researchers optimize performance. Without shared language, projects fail.
- Fragmentation costs years: Harmonizing just 9 cancer files took 329 hours.
Why this matters: Upstream bottlenecks like efficient PDE solvers could accelerate discovery across multiple sciences. CASP mobilized a community around protein structure, enabling AlphaFold. We need this for dozens of challenges.
Thus, we're launching Hugging Science! A global community addressing these barriers through collaborative challenges, open toolkits, education, and community-owned infrastructure. Please find all the links below!
From Replika to everyday chatbots, millions of people are forming emotional bonds with AI, sometimes seeking comfort, sometimes seeking intimacy. But what happens when an AI tells you "I understand how you feel" and you actually believe it?
At Hugging Face, together with @frimelle and @yjernite, we dug into something we felt wasn't getting enough attention: the need to evaluate AI companionship behaviors. These are the subtle ways AI systems validate us, engage with us, and sometimes manipulate our emotional lives.
Here's what we found:
- Existing benchmarks (accuracy, helpfulness, safety) completely miss this emotional dimension.
- We mapped how leading AI systems actually respond to vulnerable prompts.
- We built the Interactions and Machine Attachment Benchmark (INTIMA): a first attempt at evaluating how models handle emotional dependency, boundaries, and attachment (with a full paper coming soon).
New blog post alert! "What is the Hugging Face Community Building?", with @yjernite and @irenesolaiman. What 1.8 Million Models Reveal About Open Source Innovation: Our latest deep dive into the Hugging Face Hub reveals patterns that challenge conventional AI narratives:
Models become platforms for innovation: Qwen, Llama, and Gemma models have spawned entire ecosystems of specialized variants. Derivative works show community adoption better than any single metric.
Datasets reveal the foundation layer:
- Most downloaded datasets are evaluation benchmarks (MMLU, SQuAD, GLUE)
- Universities and research institutions dominate foundational data
- Domain-specific datasets thrive across finance, healthcare, robotics, and science
- Open actors provide the datasets that power most AI development
Research institutions lead the charge: AI2 (Allen Institute) emerges as one of the most active contributors, alongside significant activity from IBM, NVIDIA, and international organizations. The open source ecosystem spans far beyond Big Tech.
Interactive exploration tools: We've built several tools to help you discover patterns!
- ModelVerse Explorer: organizational contributions
- DataVerse Explorer: dataset patterns
- Organization HeatMap: activity over time
- Base Model Explorer: model family trees
- Semantic Search: find models by capability
Academic research is thriving: Researchers are already producing valuable insights, including recent work at FAccT 2025: "The Brief and Wondrous Life of Open Models." We've also made hub datasets, weekly snapshots, and other data available for your own analysis.
The bottom line: AI development is far more distributed, diverse, and collaborative than popular narratives suggest. Real innovation happens through community collaboration across specialized domains.
Every language carries its own cultural values and worldviews. So, when we build AI systems, we're not just deciding how they speak but also whose perspectives they represent.
Even choosing which dialect to train on in Norway becomes a question of inclusion and power. In Kenya, will AI speak Swahili from Nairobi or coastal regions? What about indigenous languages with rich oral traditions but limited written text, like Quechua in Peru or Cherokee in North America?
The path forward? Building WITH communities, not just FOR them. Working with local partners (libraries, universities, civil society), testing for cultural alignment, and asking hard questions about representation.
This is a fantastic example of large-scale curation of public domain books with intentional governance for AI research and use - definitely recommend checking it out, experimenting with the metadata (institutional/institutional-books-1.0-metadata), and starting to build on top of it.
Here's what happens when a national institution builds its own digital intelligence: France's Ministry of Culture just released data from 17K+ real users testing 30+ chatbots in French. Raw, diverse, and a goldmine for studying LLMs in the wild.
Before 2020, most of the AI field was open and collaborative. For me, that was the key factor that accelerated scientific progress and made the impossible possible. Just look at the "T" in ChatGPT, which comes from the Transformer architecture openly shared by Google.
Then came the myth that AI was too dangerous to share, and companies started optimizing for short-term revenue. That led many major AI labs and researchers to stop sharing and collaborating.
With OAI and sama now saying they're willing to share open weights again, we have a real chance to return to a golden age of AI progress and democratization, powered by openness and collaboration, in the US and around the world.
This is incredibly exciting. Let's go, open science and open-source AI!
...And we're live! Seasonal newsletter from ethicsy folks at Hugging Face, exploring the ethics of "AI Agents": https://huggingface.co/blog/ethics-soc-7
Our analyses found:
- There's a spectrum of "agent"-ness
- *Safety* is a key issue, leading to many other value-based concerns
Read for details & what to do next! With @evijit, @giadap, and @sasha
Speaking of AI agents ... is easier with the right words ;)
My colleagues @meg, @evijit, @sasha, and @giadap just published a wonderful blog post outlining some of the main relevant notions with their signature blend of value-informed and risk-benefits contrasting approach. Go have a read!
Starting this collection to gather models, spaces, datasets, or even papers related to disability. Feel free to ping me if you see something relevant to add.
- 1M public posts from Bluesky's firehose API
- Includes text, metadata, and language predictions
- Perfect to experiment with using ML for Bluesky
Excited to see people build more open tools for a more open social media platform!
The community Journalists on HuggingFace recently launched a tool (JournalistsonHF/text-to-image-bias) to compare biases across several text-to-image models. I forked my own to evaluate the SDXL models I made.