Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI and deep learning. Click HERE to check out previous “Heard on the Street” round-ups.
Billionaire-backed xAI open-sources Grok – Virtue signalling or true commitment? Commentary by Patrik Backman, General Partner at OpenOcean
“For once, Elon Musk is putting his principles into action. If you sue OpenAI for transforming into a profit-driven organization, you must be prepared to adhere to the same ideals. However, the reality remains that many startups are tired of larger corporations exploiting their open-source software and that not every company has the same options as the billionaire-backed xAI.
As we saw with HashiCorp or MongoDB’s strategic licensing decisions, navigating the balance between open innovation and financial sustainability is complex. Open-source projects, especially those with the potential to redefine our relationship with technology, must carefully consider their licensing models to ensure they are able to operate while staying true to their core ethos. These models should facilitate innovation, true, but they should also guard against the monopolization of technologies that have the potential to permanently impact humanity.”
On the passing of the EU AI Act. Commentary by Jonas Jacobi, CEO & co-founder of ValidMind
“While we don’t know the full scope of how the EU AI Act will affect American businesses, it’s clear that in order for enterprise companies to operate internationally, they’re going to have to adhere to the Act. That will be nothing new for many. Large American corporations that operate globally are already navigating complex regulatory environments like the GDPR, often choosing to apply these standards universally across their operations because it’s easier than having one set of rules for doing business domestically and another set of rules internationally. Small and midsize companies who are implementing or thinking about an AI strategy should stay informed and vigilant. As these global regulations and standards evolve, even primarily U.S.-based companies operating domestically will want to tailor their strategies to adhere to these standards. Recent news stories have made it clear that we can’t just rely on businesses to ‘do the right thing.’ Therefore, my advice to small and midsize companies is to use the EU AI Act as a North Star when building their AI strategy. Now is the time to build strong compliance, responsible AI governance, and robust, validated practices that will keep them competitive and reduce disruption if and when US-centric regulations are handed down.”
Platform engineering reduces developer cognitive load. Commentary by Peter Kreslins, CTO and co-Founder at Digibee
“Platform engineering is the latest way organizations are improving developer productivity, with Gartner forecasting that 80% of large software engineering organizations will establish platform engineering teams by 2026. It helps developers reduce cognitive load by shifting all tedious and repetitive tasks down to the platform while maintaining governance and compliance.
The same way cloud computing abstracted data center complexity away, platform engineering abstracts software delivery complexities away. With the application of platform engineering principles, software developers can focus on more value-generating activities rather than trying to understand the intricacies of their delivery stack.”
Overcoming Compliance: The Transformative Potential of Semantic Models in the Era of GenAI. Commentary by Matthieu Jonglez, VP of Technology – Application & Data Platform at Progress
“Combining generative AI and semantics is crucial for businesses dealing with data governance and compliance complexities in their AI deployment. Semantic models dive into the context of data, understanding not just the surface-level “what” but the underlying “why” and “how.” By grasping this, we enable AI to identify and mitigate biases and tackle privacy concerns, especially when dealing with sensitive information. In a sense, it equips AI with a human-like context, guiding it in making decisions that align with logical and ethical standards. This integration ensures that AI operations don’t just blindly follow data but interpret it with real-world sensibilities, compliance requirements and data governance policies in mind.
Semantic models also help with transparency and auditability around AI decision-making. These models help drive towards “explainable AI”. Gone are the days of “black box” AI, replaced by a more transparent, accountable system where decisions are not just made but can be explained. This transparency is crucial for building trust in AI systems, ensuring stakeholders can see the rationale behind AI-driven decisions.
Additionally, this integration plays a pivotal role in maintaining compliance. For any forward-thinking business, integrating generative AI with semantics and knowledge graphs isn’t just about staying ahead in innovation; it’s about doing so responsibly, ensuring that AI remains a reliable, compliant, and understandable tool grounded in data governance.”
Data teams are burned out – here’s how leaders can fix it. Commentary by Drew Banin, Co-Founder of dbt Labs
“Most business leaders don’t realize just how burned out their data teams are. The value that strong data insights bring to an organization is no secret, but it’s an issue if teams aren’t operating at their best. In the face of unrealistic timelines, conflicting priorities, and the burden of being the core data whisperers within an organization, these practitioners are exhausted. Not only do they have to manage tremendous workloads, but they also frequently experience minimal executive visibility. Unfortunately, it’s not uncommon for leadership to have a poor understanding of what data teams actually do.
So, what can we do about it? First, business leaders need to be mindful of the work given to their data teams. Is it busy work that won’t meaningfully move the needle, or is it impactful – and business critical? Most people – data folks included – want to see their efforts make a difference. By finding a way to trace these efforts to an outcome, motivation will go up while burnout is reduced.
Leaders could also improve their understanding of data practitioners’ workflow and responsibilities. By digging into what makes a given data project challenging, leaders might find that a small change to an upstream process could save data folks tons of time (and heartache), freeing the team up to do higher leverage and more fulfilling work. Leaders can help their data team be successful by equipping them with the right context, tools, and resources to have an outsized impact in the organization.
Once executives have more visibility into their data teams’ work and responsibilities, and are able to focus them on high impact projects, organizations will not only have a wealth of business critical insights at their fingertips, but more importantly, they’ll have a crew of engaged, capable, and eager data practitioners.”
Ethical implications of not using AI when it can effectively benefit legal clients, provided that its outputs are properly vetted. Commentary by Anush Emelianova, Senior Manager at DISCO
“Lawyers should consider the ethical implications of not using AI when it is effective at driving good results for clients, provided that its output is properly vetted.
As we have seen from cases like Mata v. Avianca, lawyers must verify the output of generative AI tools, and can’t simply take the output as true. But this is no different from traditional legal practice. Any new associate learns that she can’t just copy and paste a compelling-sounding quote from case law — it’s important to read the whole opinion as well as check whether it’s still good law. Yet lawyers have not had to get consent from clients to use secondary sources (which summarize case law, and pose the same kind of shortcut risk as generative AI tools).
Similarly, an LLM tool that attempts to predict how a judge will rule is not significantly different than an experienced lawyer who reads the judge’s opinions and draws conclusions about the judge’s underlying philosophy. Generative AI tools can drive efficiency when output is verified using legal judgment, so I hope bar associations do not create artificial barriers to adoption like requiring client consent to use generative AI — especially since this does not tackle the real issue. We will continue to see courts imposing sanctions when lawyers improperly rely on false generative AI output. This is a better approach because it incentivizes lawyers to use generative AI properly, improving their client representation.”
Data breaches. Commentary by Ron Reiter, co-founder and CTO, Sentra
“Third-party breaches continue to make headlines (in this month alone, we’ve seen them affect American Express, Fidelity Investments and Roku), especially with organizations becoming more technologically integrated as the global supply chain expands. Because of this, organizations struggle to visualize where their sensitive data is moving and what is being shared with their third parties, and these smaller third-party companies often aren’t equipped with the right cybersecurity measures to protect the data.
While third-party attacks are nothing new, there are new tools and strategies organizations can adopt to more effectively prevent and combat data breaches. By adopting innovative data security technology such as AI/ML-based analysis, GenAI assistants and other LLM engines, security teams can easily and quickly discover where sensitive data is residing and moving across their organization’s ecosystem, including suppliers, vendors, and other third-party partners. By implementing AI technologies into data security processes, teams can bolster their security posture. Because GenAI can answer complex queries that assess the potential risks associated with third parties and provide actionable insights, it’s easier to detect sensitive data that has moved outside of the organization. GenAI tools provide the ability to ensure correct data access permissions, enforce compliance regulations and offer remediation guidelines for containing threats. They can additionally ensure data security best practices are implemented by users in less technical roles including audit, compliance and privacy, supporting a holistic security approach and fostering a culture of cybersecurity across the organization.”
The Role of AI and Data Analytics in Real Estate Institutional Knowledge Preservation. Commentary by Matthew Phinney, Chief Technology Officer at Northspyre
“While the bulk of the real estate industry has historically been reluctant to embrace technology, commercial real estate developers are now acknowledging its clear benefits, particularly in addressing corporate instability, including high turnover rates. The real estate industry is notorious for its subpar data warehousing. When team members depart, valuable institutional knowledge is rarely handed over well, which means data is either lost forever or left in fragmented datasets that are spread across ad hoc emails and spreadsheets.
However, developers are finally realizing AI’s capacity to address this issue. AI-powered technology that can capture data and retrieve relevant insights can remove the decades-old silos and improve collaboration among team members. Using these technologies, professionals can easily move from project to project while maintaining access to essential portfolio data that enables them to make informed decisions further down the line. Moreover, AI can streamline routine administrative tasks like financial reporting by extracting the necessary data and packaging it into comprehensive reports, minimizing the risk of human error and reducing the time spent deciphering information from scattered sources. As a result of leveraging this type of technology, development teams have begun seeing a significant increase in efficiency in their workflows while avoiding the setbacks historically associated with significant turnover.”
Rapid AI advancements need to be balanced with new ways of thinking about protecting privacy. Commentary by Craig Sellars, Co-Founder and CEO of SELF
“AI models’ voracious appetite for data raises legitimate concerns about privacy and security, particularly in light of our outmoded data and identity paradigms. To begin, we have all of the challenges inherent in big data governance, from navigating a complex regulatory and compliance landscape to securing sensitive data against criminal attacks. AI’s nature complicates the matter further by creating additional attack surfaces. For example, users of AI chatbots frequently and sometimes unknowingly provide sensitive personal information, including confidential intellectual property, which then becomes incorporated into the AI’s knowledge base.
AI’s capabilities also extend privacy risks beyond the realm of data governance. The technology is uniquely well suited to analyzing vast amounts of data and drawing inferences. In a world where countless disconnected data points comprise individuals’ digital footprints, AI has the potential to supercharge everything from basic digital surveillance (e.g., sites you browse and ads you click) all the way to drawing conclusions about medical conditions or other protected topics. What’s more, AI’s capacity to adapt and respond in real time opens up opportunities for scammers to prey on others using deepfakes, cloned voices, and similar technologies to compromise people’s valuable financial data.
The critical through-line for all of these vulnerabilities is that they exist solely because of, or are accelerated by, the default notion that businesses and other online entities should extract data points from users via digital surveillance. This core assumption that individuals don’t own their own data naturally leads to the creation of large, centralized data assets that AI can consume, misuse and exploit. Our best defense against these vulnerabilities isn’t additional governance or regulation, but rather our ability to develop novel technologies in parallel with AI that will enhance data security for individuals, giving them more nuanced control over whether and how their data – their identity assets – are shared with external parties.”
Motivation behind CSPs’ reduction in Egress Fees. Commentary by John Mao, VP of Global Business Development at VAST Data
“In the wake of AI and as organizations continue to capture, copy, store, consume and process data at a breakneck pace, global data creation is expected to rapidly increase over the next several years. Naturally, cloud service providers (CSPs) are vying for market share of these organizations’ most precious asset, and reducing or even eliminating egress fees has become a strategic business move to attract customers. What began as an initiative by one provider quickly became a hyperscaler industry-wide trend driven by customer demand.
Data-driven organizations today recognize that different cloud providers offer different strengths and service offerings, making hybrid and multi-cloud environments more and more popular. With this in mind, these same organizations are cloud cost-conscious as their data sets continue to grow. However, these reduced egress fees likely won’t be enough to warrant any significant changes (outside of the expected growth line) to cloud adoption. In fact, in most instances, these fees are only waived if an organization is moving all of its data off a cloud, and may not do much to alleviate the cost of day-to-day data migrations between clouds.
Today’s customers prioritize contracts that offer flexibility, giving them the freedom to migrate data to and from their preferred CSPs based on the workload or application without the constraints and limitations of vendor lock-in. This trend signals a potential shift and the right steps towards unlocking true hybrid cloud architecture.”