- What disciplines are taught? (50% done)
- Which of them are not covered by Urait courses? (50% done)
- How to fill these gaps? (0% done)
To answer the question "what courses does Urait have for a given university specialization?" we:
- search for the study plans (учебные планы) for the specialization, via Google search
- parse the disciplines from the study plan page, using Gemini 2.5 Flash
- search for work programs (рабочие программы дисциплин) for each discipline
- parse the topics from the work program page, using Gemini 2.5 Flash
- embed the topics using Gemini embeddings
- embed the Urait courses using Gemini embeddings
- match the topics to the courses using cosine similarity
- for the top 5 course matches, use Gemini 2.5 Flash to decide whether the course can be used to teach the discipline
- match the study plans to the universities they came from, using the URLs
- collect roughly 10 to 100 study plans per specialization, to approximate the popularity of the disciplines
- use the popularity to sort the gaps (disciplines without matching courses)
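
The matching and ranking steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`cosine_similarity`, `top_k_courses`, `rank_gaps`) are hypothetical, embeddings are assumed to be plain lists of floats already returned by the embedding model, and popularity is assumed to be the number of study plans that list a discipline.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_courses(topic_vec, course_vecs, k=5):
    """Return the k (course_name, score) pairs most similar to the topic vector.

    course_vecs is assumed to map a course name to its embedding.
    """
    scored = [
        (name, cosine_similarity(topic_vec, vec))
        for name, vec in course_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def rank_gaps(study_plans, covered):
    """Sort uncovered disciplines by how many study plans mention them.

    study_plans is assumed to be a list of discipline-name lists, one per plan;
    covered is the set of disciplines that already have a matching course.
    """
    # set(plan) counts each discipline at most once per study plan.
    counts = Counter(d for plan in study_plans for d in set(plan))
    gaps = [(d, n) for d, n in counts.items() if d not in covered]
    # Most popular first; ties broken alphabetically for a stable order.
    gaps.sort(key=lambda pair: (-pair[1], pair[0]))
    return gaps
```

In this sketch the top-5 candidates returned by `top_k_courses` would then be passed to the LLM check, and only disciplines with no accepted course end up in the input to `rank_gaps`.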
- run `pip install -r requirements.txt`
- download `project_subjects.csv` from drive.
- make a `.env` file with `GOOGLE_API_KEY=your_key` and `SERPER_API_KEY=your_key`.
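
Before running the pipeline, it can help to confirm that both keys are actually present in the `.env` file. The snippet below is a small standard-library sketch under stated assumptions: `parse_env_text` and `missing_keys` are hypothetical helpers, and the project itself may load the file differently (for example through a dotenv library).

```python
REQUIRED_KEYS = ("GOOGLE_API_KEY", "SERPER_API_KEY")


def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, `missing_keys(parse_env_text(open(".env").read()))` returns an empty list when both keys are set.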