Big data has always required specialized expertise. As datasets grow into tens of terabytes and workloads run across hundreds of machines, performance becomes just as important as correctness. At the same time, AI coding tools have become mainstream for software development, yet they fall short when dealing with the performance constraints of distributed data systems. DataFlint, founded by Meni Shmueli, aims to close that gap.
Bringing AI Coding to Big Data
Meni explains that while tools like GitHub Copilot or Cursor can generate code, they cannot optimize big data applications in real production environments. These applications rely on fine-tuned performance adjustments that depend on understanding how the code behaves on clusters, how it processes huge datasets, and where bottlenecks form.
DataFlint addresses this by collecting real performance data from big data engines and turning those insights into actionable context for AI coding assistants. With that context, the AI can write optimized code that performs correctly at scale. Developers can ask their editor to fix or improve a Spark job, and the tool uses DataFlint's insights to return an optimized version.
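To make the idea concrete, here is a minimal sketch of what "converting performance data into AI-readable context" could look like. DataFlint's actual data model and APIs are not described in the article, so everything here is an assumption: the function name, the metric keys, and the heuristics (task-time skew, shuffle spill) are hypothetical examples of the kind of signal a tool like this might surface.

```python
# Illustrative sketch only -- DataFlint's real implementation is not public.
# It shows how raw per-stage metrics from a Spark-like engine could be
# distilled into plain-language hints for an AI coding assistant.

def summarize_stage_metrics(stages):
    """Turn per-stage task timings into plain-language performance hints."""
    hints = []
    for stage in stages:
        times = sorted(stage["task_durations_ms"])
        median = times[len(times) // 2]
        slowest = times[-1]
        # One task far slower than the median usually signals data skew.
        if median > 0 and slowest > 5 * median:
            hints.append(
                f"Stage {stage['id']}: likely data skew "
                f"(slowest task {slowest} ms vs median {median} ms); "
                f"consider salting the join key or repartitioning."
            )
        # Shuffle spill means executors ran out of memory and hit disk.
        if stage.get("spill_bytes", 0) > 0:
            hints.append(
                f"Stage {stage['id']}: {stage['spill_bytes']} bytes spilled "
                f"to disk; consider more shuffle partitions or executor memory."
            )
    return "\n".join(hints)

# Example: one skewed stage, one stage that spilled to disk.
context = summarize_stage_metrics([
    {"id": 3, "task_durations_ms": [200, 210, 220, 4000], "spill_bytes": 0},
    {"id": 7, "task_durations_ms": [500, 510, 520], "spill_bytes": 1 << 30},
])
print(context)
```

A summary like this, pasted into an assistant's context window alongside the job's source code, gives the model something it cannot infer from the code alone: how the job actually behaved on the cluster.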
The Problem Meni Saw Up Close
The idea for DataFlint came directly from Meni’s own experience. After serving in Unit 81, he moved into big data and AI engineering roles. He noticed that data analysts and data scientists often found themselves stuck. Their queries timed out. Their jobs ran indefinitely. They wanted to build models and products, but performance issues slowed them down.
They would come to Meni for help, but even with his expertise, solving these performance problems across an entire organization was unrealistic without new tooling. He began thinking about how to scale his knowledge across a team. When AI coding assistants emerged, the timing aligned perfectly. If he could feed the AI the right performance context, it could write optimized big data code automatically. That became the foundation of DataFlint.
Why Advances in AI Make DataFlint More Important, Not Less
Some founders worry that advances in AI could make their products obsolete. Meni sees the opposite. As AI becomes more capable, it requires more domain-specific context to solve complex problems. AI alone cannot know how a Spark job performs on a real production cluster or how a dataset behaves under scale. It needs performance insights that only DataFlint can provide.
In other words, the more powerful AI becomes, the more it needs relevant data and context to generate correct solutions. DataFlint sits at that intersection.
An Ecosystem Built on Community
Big data has long been rooted in open-source culture. From its earliest days, the ecosystem grew through shared engines, meetups, and collective innovation. Databricks itself emerged from the creators of Apache Spark, one of the most widely used big data engines. Meni sees that culture continuing.
DataFlint contributes to the ecosystem with open-source tools, Apache involvement, and a growing list of community meetups in Tel Aviv, New York, and soon San Francisco. Its newsletters and educational content help engineers better understand performance optimization and modern big data practices. For DataFlint, community is not marketing. It is part of the DNA of big data.
What’s Next for DataFlint
Meni believes most code will eventually be written by AI, and the key to getting it right will be context. The next phase of DataFlint involves:
- Expanding the performance insights fed into AI tools
- Growing the global community of big data engineers
- Hosting more meetups with major partners such as Microsoft and Akamai
The goal is to support engineers as their work shifts from writing code line by line to designing systems, defining logic, and letting AI handle the implementation.
Will AI Reduce the Number of Engineers?
A common question is whether AI will eliminate engineering jobs. Meni’s view is more nuanced. In big data, much of the work is not writing code but understanding the domain and translating business goals into data systems. Surveys already show that big data engineers and machine learning engineers are among the most in-demand roles in the AI era.
Rather than replacing engineers, AI will amplify their abilities. It will let them build more intelligent products, ship faster, and focus on architecture rather than syntax. In other words, AI will not eliminate big data engineers. It will require more of them.
Closing Thoughts
DataFlint represents a shift in how complex data systems will be built in the years ahead. As workloads grow and AI takes on a larger share of software creation, performance context becomes essential. By bridging AI coding tools with real production insights, DataFlint gives engineers the ability to build scalable, efficient systems without getting stuck on the bottlenecks that once slowed entire teams.
As the big data community expands and AI tools become central to development, companies that provide this layer of intelligence will define the next generation of data engineering. DataFlint is already positioning itself at the heart of that transformation.
Interested in more insights about Israeli startups expanding globally? Follow IsraelTech for exclusive interviews, deep dives, and expert perspectives.