Thursday, January 23, 2025

Why data science alone won’t make your product successful




The last decade has seen the divide between tech and commercial teams thin almost to the point of nonexistence. And I, for one, am in favor of it. Not every tech team works in a tech company, and blurring the lines between the commercial and technological means that we can build and ship product safe in the knowledge that it will be well received, widely adopted (not always a given), and contribute meaningfully to the bottom line. Name a better way to motivate a high-performance tech team, and I’ll listen. 

It’s a change that was accelerated by, if not caused by, data tech. We’ve spent decades working through big data, business intelligence, and AI hype cycles. Each introduced new skills, problems and collaborators for the CTO and their team to get to grips with, and each moved us just a little further from the rest of the organization; no one else can do what we do, but everyone needs it done.

Technical teams are not inherently commercial, and as their roles expanded to include building and delivering tools to support various teams across the organization, this gap became increasingly apparent. We’ve all seen the stats about the number of data science projects, in particular, that never get productionized, and it’s little wonder. Tools built for commercial teams by people who don’t fully understand their needs, goals or processes will always be of limited use.

This waste of technology dollars was easy to justify in the early days of AI, when investors wanted to see investment in the technology rather than outcomes, but the tech has matured, and the market has shifted. Now, we have to show actual returns on our technology investments, which means delivering innovations that have a measurable impact on the bottom line.

Transitioning from support to a core function

The growing pains of the data tech hype cycles have delivered two incredible boons to the modern CTO and their team (over and above the introduction of tools like machine learning (ML) and AI). The first is a mature, centralized data architecture that removes historical data silos across the business and gives us a clear picture — for the first time — of exactly what’s happening on a commercial level and how one team’s actions affect another. The second is the move from a support function to a core function.  

This second one is important. As a core function, tech workers now have a seat at the table alongside their commercial colleagues, and these relationships help to foster a greater understanding of processes outside of the technology team, including what these colleagues need to achieve and how that impacts the business. 

This, in turn, has changed how technical people work. For the first time, they are no longer squirreled away, fielding unconnected requests from across the business to pull this stat or crunch this data. Instead, they can finally see the impact they have on the business in monetary terms. It’s a rewarding viewpoint, and one that has given rise to a new way of working: an approach that maximizes this contribution and aims to generate as much value as quickly as possible.

Introducing lean value

I hesitate to add another project management methodology to the lexicon, but lean-value warrants some consideration, particularly in an environment where return on tech investment is so heavily scrutinized. The guiding principle is ‘ruthless prioritization to maximize value.’ For my team, that means prioritizing research with the highest likelihood of either delivering value or progressing organizational goals. It also means deprioritizing non-critical tasks.

We focus on attaining a minimum viable product (MVP), applying lean principles across engineering and architecture, and — here’s the tricky bit — actively avoiding a perfect build in the initial pass. Each week, we review non-functional requirements and reprioritize them based on our objectives. This approach reduces unnecessary code and prevents teams from getting sidetracked or losing sight of the bigger picture. It’s a way of working we’ve also found to be inclusive of neurodiverse individuals within the team, since there’s a very clear framework to remain anchored to.  
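As a rough illustration of the weekly reprioritization described above, here is a minimal Python sketch. It is not our actual tooling: the `BacklogItem` fields, the scoring formula (risk-adjusted value per week of effort) and the capacity rule are all assumptions chosen to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BacklogItem:
    name: str
    expected_value: float      # hypothetical estimate of business value, arbitrary units
    success_likelihood: float  # hypothetical probability (0..1) that the value is realized
    effort_weeks: float        # hypothetical effort estimate, must be greater than zero
    critical: bool = False     # critical work is scheduled ahead of everything else


def priority_score(item: BacklogItem) -> float:
    """Risk-adjusted value per week of effort (an illustrative formula, not a standard)."""
    return item.expected_value * item.success_likelihood / item.effort_weeks


def prioritize(items, capacity_weeks):
    """Take critical items first, then the highest-scoring items that still fit capacity."""
    ordered = sorted(items, key=lambda i: (not i.critical, -priority_score(i)))
    selected, deferred = [], []
    remaining = capacity_weeks
    for item in ordered:
        if item.effort_weeks <= remaining:
            selected.append(item)
            remaining -= item.effort_weeks
        else:
            deferred.append(item)  # deprioritized until the next weekly review
    return selected, deferred
```

Calling `prioritize(backlog, capacity_weeks=4)` once a week returns the work to start now and the work to push back. The scoring rule matters less than the habit of applying one rule consistently at every review.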

The result has been accelerated product rollouts. We have a dispersed, international team and operate a modular microservice architecture, which lends itself well to the lean-value approach. Weekly reviews keep us focused and prevent unnecessary development — itself a time saver — while allowing us to make changes incrementally and so avoid extensive redesigns. 

Leveraging LLMs to improve quality and speed up delivery 

We set quality levels we must achieve, but opting for efficiency over perfection means we’re pragmatic about using tools such as AI-generated code. GPT-4o can save us time and money by generating architecture and feature recommendations. Our senior staff then spend their time critically assessing and refining those recommendations instead of writing the code from scratch themselves.

There will be plenty who find that particular approach a turn-off or short-sighted, but we’re careful to mitigate risks. Each build increment must be production-ready, refined and approved before we move on to the next. There is never a stage at which humans are out of the loop. All code, especially generated code, is overseen and approved by experienced team members in line with our own ethical and technical codes of conduct.

Data lakehouses: lean value data architecture

Inevitably, the lean-value framework spilled out into other areas of our process, and embracing large language models (LLMs) as a time-saving tool led us to the data lakehouse, a portmanteau of data lake and data warehouse.

Standardizing data and structuring unstructured data to deliver an enterprise data warehouse (EDW) is a years-long process, and it comes with downsides. EDWs are rigid, expensive and have limited utility for unstructured data or varied data formats. 

A data lakehouse, by contrast, can store both structured and unstructured data. Using LLMs to process this data reduces the time required to standardize and structure it, and automatically transforms it into valuable insight. The lakehouse provides a single platform for data management that can support both analytics and ML workflows and requires fewer resources from the team to set up and manage. Combining LLMs and data lakehouses speeds up time to value, reduces costs, and maximizes ROI.
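To make the idea of structuring mixed data concrete, here is a minimal Python sketch under stated assumptions. The `SCHEMA` field names are hypothetical, and `regex_extractor` is a deliberately simple stand-in for an LLM extraction call, so that the example runs without any external service. In practice, the `extract` argument is where a model call would be plugged in.

```python
import json
import re
from typing import Callable, Dict, Optional

# Target schema that every record is normalized into (hypothetical field names).
SCHEMA = ("customer_id", "amount")

Extractor = Callable[[str], Dict[str, Optional[str]]]


def regex_extractor(text: str) -> Dict[str, Optional[str]]:
    """Stand-in for an LLM extraction call: pulls the schema fields out of free text."""
    customer = re.search(r"customer\s+(\w+)", text, re.IGNORECASE)
    amount = re.search(r"\$(\d+(?:\.\d+)?)", text)
    return {
        "customer_id": customer.group(1) if customer else None,
        "amount": amount.group(1) if amount else None,
    }


def to_row(raw: str, extract: Extractor) -> Dict[str, Optional[str]]:
    """Normalize one raw record into SCHEMA, whether it is JSON or free text."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = None
    # Structured records pass straight through; everything else goes to the extractor.
    fields = parsed if isinstance(parsed, dict) else extract(raw)
    return {
        key: None if fields.get(key) is None else str(fields[key])
        for key in SCHEMA
    }
```

Structured and unstructured inputs end up in the same shape, which is the property that lets one platform serve both analytics and ML workflows. Anything the extractor cannot find stays `None` rather than being guessed, which keeps gaps visible to downstream governance checks.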

As with the lean-value approach to product development, this lean-value approach to data architecture requires some guardrails. Teams need to have robust and well-considered data governance in place to maintain quality, security and compliance. Balancing the performance of querying large datasets while maintaining cost efficiency is also an ongoing challenge that requires constant performance optimization.

A seat at the table

The lean-value approach is a framework with the potential to change how technology teams integrate AI insight with strategic planning. It allows us to deliver meaningfully for our organizations, motivates high-performing teams and ensures they’re used to maximum efficiency. Critically for the CTO, it ensures that the return on technology investments is clear and measurable, creating a culture in which the technology department drives commercial objectives and contributes as much to revenue as departments such as sales or marketing.

Raghu Punnamraju is CTO at Velocity Clinical Research.
