Dagster integration for MariaDB – 2nd place in MariaDB BangPypers Hackathon 2025

Ria and Revanth interviewed about the MariaDB Hackathon

Part of the blog series on the MariaDB BangPypers Hackathon, September – November 2025. Platform: mariadb-python.hackerearth.com

We recently announced the winners of the MariaDB Hackathon at the BangPypers meetup in Bengaluru. We are now interviewing the winning teams, continuing with the second-place winners of the Integration track.

We sat down with Ria Kulkarni and Revanth Sreeram Ambati, who developed a Dagster integration for MariaDB, bringing modern data orchestration capabilities to the MariaDB ecosystem. They were interviewed by Robert Silén, Community Advocate, and Kaj Arnö, Executive Chairman of the MariaDB Foundation.

Watch the recorded interview on YouTube, or read the summary below.

First, let us repeat a short introduction on the topic of data orchestration that we also shared in the interview with the Airflow Integration track winner. 

Introducing Data Orchestration Frameworks (for non-Data Engineers)

Before we move to the interview, it’s worth introducing the concept of data orchestration frameworks — especially for readers who are not data engineers.

Put simply, these frameworks help you organise and run complex chains of data tasks — a bit like turning a handful of individual scripts into a well-rehearsed orchestra. Each instrument (or task) plays in the right order, with the right timing, so that the whole data pipeline performs smoothly and predictably. You could think of it as “scripting made smarter” — scripts adapted for today’s large-scale, cloud-based data environments.

Some of the most widely used data orchestration frameworks include Apache Airflow, Dagster, Prefect, and Luigi. Each helps engineers define, schedule, and monitor workflows so that data can move and transform reliably across increasingly complex systems.
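To make the "right order" idea concrete, here is a minimal sketch in plain Python. It is not how Airflow, Dagster, Prefect, or Luigi are implemented; it only illustrates the core idea they share, which is running each task after the tasks it depends on. The task names and the `run_pipeline` helper are hypothetical and exist only for this illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, dependencies):
    """Run every task once, after all of the tasks it depends on.

    tasks:        maps a task name to a function taking a dict of upstream results
    dependencies: maps a task name to the set of task names it depends on
    """
    results = {}
    order = []
    # static_order() yields each name only after all of its dependencies.
    for name in TopologicalSorter(dependencies).static_order():
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = tasks[name](upstream)
        order.append(name)
    return order, results


# Hypothetical three-step pipeline: extract -> transform -> load.
tasks = {
    "extract": lambda up: [3, 1, 2],
    "transform": lambda up: sorted(up["extract"]),
    "load": lambda up: f"loaded {len(up['transform'])} rows",
}
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

order, results = run_pipeline(tasks, dependencies)
print(order)
print(results["load"])
```

Real orchestration frameworks add scheduling, retries, monitoring, and distributed execution on top of this ordering step.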

Who are you and what’s your background?

Ria: I’m a Software Engineer at McKesson Compile. I’ve been working full-time for about six months – I recently graduated from college and I’m currently having a great time figuring out my place in the tech world.

Revanth: I’m a Software Engineer at Amadeus. I also recently graduated from PES University in Bangalore. My tech stack generally revolves around Python. 

Do you have any previous hackathon experience?

Revanth: Yes, we both have participated in quite a few hackathons before, mostly during college. This was our first time working together in a hackathon, and it was a great experience collaborating on something we’re both passionate about.

How did you find the MariaDB hackathon and what caught your interest?

Ria: I attended the BangPypers meetup where the hackathon was announced. The ideation phase was longer than in hackathons where everything has to be done in 24 hours, which gave us a chance to explore the space and figure out our interests.

What was your first experience with MariaDB?

Ria: We had explored MySQL as part of our courses in college and university. I had heard about MariaDB as an alternative to MySQL, but this hackathon was my first time really diving deep into its features and capabilities. Learning about its performance improvements and unique features was eye-opening.

Revanth: Similar to Ria, we learnt MySQL in school. I knew about MariaDB but hadn’t worked with it. I learnt it is an open source version of MySQL that is faster and has more features. 

Why did you choose to integrate MariaDB with Dagster?

Ria: As a Software Engineer at McKesson Compile, I deal with a lot of data pipeline orchestrations. At work, we currently use Prefect for task-based orchestrations, but I was exploring alternatives because we’ve encountered scalability issues when running around 500 jobs within a second.

When I was looking for alternatives, I came across Dagster and did a small POC project on the side. When the hackathon came up, I thought it would be really interesting to combine both learning opportunities. If I was already learning Dagster anyway, it would be cool for me to learn two things at once – both Dagster and MariaDB – integrate them, join those two dots, and have a more comprehensive project to show for what I’d been working on over the past few months.

How does MariaDB compare to Postgres for data orchestration workflows?

Ria: I wouldn’t say I have extensive exposure to Postgres, though we do use it at work primarily because we’re Django-based and it makes ORM very easy for us.

However, when I was researching for this project, I found that MariaDB has some extra features that I think are particularly well-suited for data orchestration environments. Things like the different storage engines and Galera clustering for multi-master setups – I don’t think Postgres would perform as well in those specific cases.

In an environment like Dagster, where all these assets come together and you have multiple reads and writes happening throughout the process of the pipeline, I feel like this is a space where MariaDB can really shine.

What does your hackathon submission do?

Revanth: We’ve integrated Dagster with MariaDB, and our main goal from the beginning was to make sure the integration has all the features specific to MariaDB while working seamlessly with Dagster’s capabilities. We focused on integrating features from both ends to figure out what would work best for this specific combination.

One of the key features we implemented is partitioned asset support – both from the Dagster side with software-defined assets and partitions, and also native partitioning on MariaDB. We’ve also included other features that work generically with MariaDB and MySQL.

Looking ahead, our plan is to get the PR accepted and push the code to Dagster’s official repository. To achieve this, we’re working on better documentation and testing so that anyone who wants to use MariaDB as part of their Dagster pipelines can get started quickly.
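As an editorial illustration of the partitioning idea in the answer above, the following sketch shows one way a daily pipeline partition could map onto a native MariaDB table partition. It is not taken from the team's integration. It assumes a daily partition key in `YYYY-MM-DD` form (the default format of Dagster's daily partitions) and a hypothetical table named `events` that was created with `PARTITION BY RANGE (TO_DAYS(...))`. The helper names are invented for this example, and the table name is assumed to come from trusted code, not user input.

```python
from datetime import date, timedelta


def partition_name(partition_key: str) -> str:
    """Map a daily key such as '2025-09-01' to a partition name such as 'p20250901'."""
    day = date.fromisoformat(partition_key)  # raises ValueError on a malformed key
    return f"p{day:%Y%m%d}"


def add_partition_sql(table: str, partition_key: str) -> str:
    """Build the DDL that adds one day's partition to a RANGE-partitioned table.

    The partition holds rows whose date is before the following day. With RANGE
    partitioning, new partitions can only be added at the upper end.
    """
    upper = date.fromisoformat(partition_key) + timedelta(days=1)
    return (
        f"ALTER TABLE `{table}` ADD PARTITION "
        f"(PARTITION {partition_name(partition_key)} "
        f"VALUES LESS THAN (TO_DAYS('{upper.isoformat()}')))"
    )


def truncate_partition_sql(table: str, partition_key: str) -> str:
    """Build the DDL that empties one day's partition before it is rewritten."""
    return f"ALTER TABLE `{table}` TRUNCATE PARTITION {partition_name(partition_key)}"


# Hypothetical usage for the 2025-09-30 partition of an `events` table.
print(add_partition_sql("events", "2025-09-30"))
print(truncate_partition_sql("events", "2025-09-30"))
```

Truncating a single partition before reloading it means that re-running one day's pipeline partition replaces only that day's rows, which is the main appeal of aligning the two partitioning schemes.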

Do you need any help getting the pull request accepted, and how can users try out the integration now?

Revanth: In terms of MariaDB support, one of the developers put in a lot of helpful comments on our PR, and we’ll continue our communication with them to ensure that any MariaDB-specific features we might have overlooked are included. We’ve received plenty of valuable feedback from the developers so far.

Ria: Meanwhile, before the PR is officially accepted, users can already test the integration. Our README lists all the steps for testing, so developers can try it on their own local systems.

Our main goal for the next couple of weeks is to streamline our code based on the comments we’ve received – that’s our primary focus right now.

What was your experience participating in the hackathon?

Revanth: It went really well – we had a lot of fun working on this hackathon alongside our regular jobs. It gave us a great opportunity to step away from our daily work and learn new things. For me personally, it was a chance to get exposure to MariaDB and understand how data orchestrators like Dagster work. The ideation phase was also quite enjoyable, with lots of discussions about what we could build and which features would work best together.

Ria: What made it especially fun for me is that Revanth and I have worked on projects together before, but it had been a while since our last collaboration. This hackathon was like relearning our work processes together, and it was really rewarding. We were able to apply professional project management practices – using Jira and other tools – and it was interesting to see how those processes work outside of our regular jobs.

One suggestion for future hackathons would be to have more visibility into what other participants are working on. It would be interesting to see other project themes and get a sense of what everyone else is building. I’m not sure if that was available and I just missed it, but having that kind of community visibility would add another dimension to the experience.

Kaj: We would have loved to provide more personal attention and feedback to teams in the earlier phases, but with so many participants, it was really challenging to identify which projects needed our coaching. Your feedback about wanting more visibility is valuable, and we’re definitely taking note of it.

Robert: That’s a good takeaway. We’ve been discussing how transparent to make things between teams – there’s a balance between visibility and making sure people don’t shy away from ideas because someone else is already working in that space. We’ll see how we can incorporate this feedback in future hackathons.

Thank you and closing thoughts

Kaj: We want to extend our thanks to Ria and Revanth not just for their excellent implementation, but for the valuable insight that a Dagster integration was needed for MariaDB. Honestly, if someone had asked us about our integration wish list earlier, we probably wouldn’t have thought to include Dagster – and that would have been a mistake. Dagster is emerging as one of the key orchestration platforms alongside Airflow, and having MariaDB support there is genuinely important for the ecosystem.

The value here goes beyond the code itself – it’s about identifying where MariaDB needs to be present in the modern data stack. So thank you, Ria and Revanth, for both the great idea and the great implementation.

We’re excited to see this integration move forward and eventually become part of the official Dagster repository. If you’re interested in trying it out or contributing, check out their repository and test documentation.

Further reading