danielrosehill/Hebrew-LLMs


Hebrew-LLMs

A curated collection of Hebrew language AI models available on Hugging Face as of April 17, 2025. This repository aims to provide a good starting point for anyone looking to work with Hebrew language AI models.

About Hebrew Language and AI

Hebrew is a Semitic language with approximately 9 million speakers worldwide, most of them in Israel. Despite its relatively small speaker base, Hebrew has several characteristics that make it interesting for AI research:

  • Modern vs Biblical Hebrew: There are significant differences between Modern Hebrew and Biblical Hebrew, with specialized models developed for biblical text analysis.

  • Punctuation Challenges: Modern written Hebrew typically lacks extensive punctuation, creating a need for specialized models that can infer and add appropriate punctuation.

  • Technological Hub: Israel is a renowned center for technology and AI research, making Hebrew language AI models particularly interesting from an experimental and innovation perspective.

  • Rich Linguistic Structure: Hebrew's non-Latin script, right-to-left writing system, and complex morphology present unique challenges for language models.

These factors make Hebrew language AI development both challenging and valuable, with applications ranging from biblical text analysis to modern NLP tasks.
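
To make the script-related points above concrete, here is a minimal sketch using only the Python standard library. It shows that each Hebrew letter takes two bytes in UTF-8, that Hebrew letters carry the right-to-left bidirectional class, and that vowel points (common in Biblical text editions) are separate combining marks that can be stripped. The helper names `strip_marks` and `describe` are illustrative, not part of any library, and the sample word is an assumption chosen for the example.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Drop combining marks (Unicode category Mn), which covers Hebrew vowel points."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def describe(text: str) -> dict:
    """Summarize script-level properties that affect tokenization and display."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "has_rtl_letters": any(unicodedata.bidirectional(ch) == "R" for ch in text),
    }


# "shalom" written with vowel points, spelled out as escapes so the
# example does not depend on the editor's right-to-left rendering.
pointed = "\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd"
unpointed = strip_marks(pointed)

print(describe(pointed))
print(describe(unpointed))
print(describe("shalom"))
```

The pointed form uses more code points and bytes than the unpointed form for the same word, which is one reason models trained mostly on unpointed Modern Hebrew may handle pointed Biblical text differently.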

Large Language Models (LLMs)

Mistral Fine-tunes

  • Hebrew-Mistral-7B
  • Hebrew-Mistral-7B-200K
  • Hebrew-Mistral-7B_Chat-GGUF
  • Hebrew-Mistral-7B-Instruct-v0.1-GGUF

Mixtral Fine-tunes

  • Hebrew-Mixtral-8x22B

Gemma/Google Fine-tunes

  • Hebrew-Gemma-11B
  • Hebrew-Gemma-11B-Instruct
  • Hebrew-Gemma-11B-V2-mlx-4bit

Niche Text Models

Summarization

hebrew-summarization-llm

Biblical

hebrew_bible_ai

Metaphor Detection

hebert-finetuned-hebrew-metaphor

Translation

  • t5-hebrew-translation
  • english-hebrew-translation

Offensive Language

Offensive-Hebrew

Punctuation

hebrew_punctuation

Sentiment Analysis

xlm-r_hebrew_sentiment

Specialized Language Models

Hebrew to SQL

Llama-3.1-8b-Hebrew2SQL

Hebrew Medical Terms (NER)

hebrew_medical_ner_v5

ASR Models (Speech Recognition)

  • wav2vec2-large-xlsr-53-hebrew
  • Whisper_hebrew_medium

TTS Models (Text-to-Speech)

Note: This section will be populated with Hebrew TTS models in the future.

Benchmarks and Leaderboards

The Hebrew LLM Leaderboard provides valuable insights into the performance of various models on Hebrew language tasks:

  • Hebrew LLM Leaderboard
  • Hebrew Question Answering Dataset

Leaderboard Insights

One observation from the leaderboard is that large multilingual LLMs (such as Mistral and Meta-Llama models) generally outperform specialized Hebrew models, which is likely explained in large part by their significantly larger parameter counts. Specialized Hebrew models still appear on the leaderboard and perform reasonably well for their size.

The benchmark evaluates models across several categories:

  • SNLI (Natural Language Inference)
  • QA (Question Answering, scored with TLNLS, a token-level normalized Levenshtein similarity metric)
  • Sentiment Analysis
  • Winograd Schema Challenge
  • Translation
  • Israeli Trivia (a unique category testing cultural and local knowledge)

Together, these categories give a broad view of model capabilities in Hebrew, from inference and comprehension to translation and local knowledge.

Organizations to Follow

  • Dicta
  • MAFAT (National Natural Language Processing Plan Of Israel)

Other Interesting Projects

HebrewManuscriptsMNIST

Additional Links

Hebrew-Models-Collection

Worth Following

yam-peleg

Reading

  • Best LLM for Hebrew Classification
  • Hebrew LLM Paper
  • Hebrew Model Sentiment Analysis
  • Huggingface Hebrew Leaderboard
  • Hebrew GPT Neo XL

About

A pathfinder repo (index) to some Hebrew language LLMs on Hugging Face