Episode 11

Wasabi AiR and the Future of Object Storage with Aaron Edell

Host Chris Lacinak is joined by Aaron Edell, SVP of AI at Wasabi, to talk about the recently launched Wasabi AiR, AI, object storage, and more. Wasabi AiR is an innovative and strategic development in the DAMosphere that aims to change business as usual and push boundaries with what Aaron describes as redefining object storage. This is a fascinating deep dive and behind-the-scenes look into an important milestone in the intersection between AI and digital asset management from a leader and expert in the field.

Disclaimer: Neither Chris nor AVP can be held responsible for Aaron's ownership of, or affinity for a Cybertruck :-).

Guest Bio: 

Former GrayMeta CEO Aaron Edell joined Wasabi as SVP of AI & Machine Learning in 2024, following Wasabi's acquisition of Curio AI from GrayMeta. Aaron leads the team tasked with developing the company's intelligent cloud storage technology.

Aaron brings deep-rooted experience in media & entertainment product development, innovation, and digital asset management to his newly created role at Wasabi. He was responsible for building many of GrayMeta’s first products, including Curio, and grew GrayMeta into an essential workflow component for the likes of Netflix, Walt Disney Studios, and NBC Universal. He also co-founded Machine Box, an AI company that revolutionized where machine learning could run, and served as Global Head of Business for AWS, leading its Customer Cloud Intelligence and FinOps division.

Engage:

🆓 Download amazing and free DAM resources from the best DAM consultants in the business at https://weareavp.com/free-resources

⭐ Please rate, like, follow, and subscribe on your podcast platform of choice. See all the places we are at https://dam-right.com/listen

🔗 Follow me on LinkedIn at https://linkedin.com/in/clacinak

License info:

Music from Uppbeat (free for Creators!):

https://uppbeat.io/t/hey-pluto/the-gentleman

License code: PYLFGIFQ88M6KNBS

Transcript
Chris Lacinak:

The practice of using AI to generate metadata has been around for almost a decade now.

Even with pretty sophisticated and high-quality platforms and tools, it's still fair to say that the hype has far outpaced the adoption and utilization.

My guest today is Aaron Edell from Wasabi.

Aaron is one of the folks that is working on making AI so easy to use that we collectively

glide over the hurdle of putting effort into using AI and find ourselves happily reaping

the rewards without ever having had to do much work to get there.

It's interesting to note the commonalities and approach with both Aaron and the AMP Project

folks who I spoke with a couple of episodes ago.

Both looked at this problem and aimed to tackle it by bringing together a suite of AI tools

into a platform that orchestrates their capabilities to produce a result that is greater than the

sum of their individual parts.

Aaron is currently the SVP of AI at Wasabi.

Prior to this, he was the CEO of GrayMeta, served as the Global Head of Business and

GTM at Amazon Web Services, and was involved in multiple AI and ML businesses in founding

and leadership roles.

Aaron's current focus is on the Wasabi AiR platform, which they announced just before

I interviewed him.

I think you'll find his insights to be interesting and thought-provoking.

He's clearly someone who has thought about this topic a lot, and he has a lot to share

that listeners will find valuable and fun.

Before we dive in, I would really appreciate it if you would take two seconds to follow,

rate, or subscribe on your platform of choice.

And remember, DAM Right, because it's too important to get wrong.

Aaron Edell, welcome to the DAM Right podcast.

Great to have you here.

Aaron Edell:

It's an honor. Thank you for having me.

Chris Lacinak:

I'm very excited to talk to you today for a number of reasons.

One, you've recently announced a really exciting development at Wasabi. Can't wait to talk about that.

But also, our career paths have paralleled and intersected in kind of strange ways over

the past couple decades.

We both have a career start and an intersection around a guy by the name of Jim Lindner, who

was the founder of Vidipax, a place that I worked for a number of years before I started

AVP, and who was also the founder of SAMMA, where you kind of, I won't say you started

there, you had a career before that, but that's where our intersection started.

But I'd love for you to tell me a bit about your history and your path that brought you

to where you are today.

Aaron Edell:

Yeah, definitely.

The other funny thing about Jim is that he is a fellow tall person. So folks who are listening to this can't tell, but I'm six foot six, and I believe Jim is

also six six or maybe six seven.

So when you get to that height, there's a little Wi-Fi that goes on between people of

similar height that you just make a little connection.

You kind of look at each other and go, "I know your pain.

I know your back hurts."

So my whole life growing up, ever since really I was five years old, I loved video, recording,

shooting movies, filming things.

I eventually went to college for it.

I did it a lot in high school.

And this is back in the early 90s when video editing was hard.

And the kid in high school who knew how to do it and had the Mac who could do it was

kind of the only person able to actually create content.

So I was rarefied, I guess, in that sense.

So I would go to film festivals and all sorts, and it was just a great time.

And I was never very good at it.

I just really loved it.

And when you love something, especially when you're young, you learn all of the things

you need to know to accomplish that.

So I learned a lot about digital video just because I had to figure out how to get my

stupid Mac to record and transcode.

And then I got introduced to nonlinear editing very early on and learning that.

So when I went to college, I went there for film and video, really.

That was what I thought I wanted to be when I grew up was a filmmaker.

My father was talent for KGO television and ABC News for a long time.

So I had some familial-- and my mother was the executive producer of his radio show.

So I had a lot of familial, sort of, media and entertainment world around and was very

supported in that way, I suppose.

By the time I got-- so I went to college, and I loved my college.

Hampshire College is a fantastic institution.

It has no tests, no grades.

It has a-- you design your own education, which is not something I was prepared for,

by the way, when I went there.

I'm so thrilled I went there because all of my entrepreneurial success is because of what

I learned there.

But at the time, I had no appreciation for that.

And I just thought, well, this is strange.

I'm here for film and video, and they're like, here's a camera.

Here's a recording button.

And I thought, mm, this is an expensive private college in Massachusetts and probably need

to make it a little bit harder.

So my father is a physician, so I thought pre-med.

And I did it.

I went full on pre-med.

I was going to be a doctor.

I was going to apply to medical school.

But I was also working on documentaries and producing stuff and acting in other people's

films and things like that.

So I still-- that love, that passion never went away.

I was just kind of being creative about how to do it.

And my thesis project ended up being a documentary about a medical subject, which was kind of

perfect.

Because at the end of the day, my father, he's a physician, but he's actually a medical reporter.

And that's a whole separate field that fascinated me.

So when I graduated, I was like, OK.

I went and actually got a job producing and editing a show for PBS, which was super cool

in New York City.

And that was around:

I was doing it for a couple of years.

And we were-- it was a PBS show, so we were very reliant on donations and whatnot.

And:

It dried up.

We ran out of money.

And I was looking for a job.

And I worked on a couple of movies that were being shot in the city.

And I found this job at this weird company called SAMMA Systems on 10th Avenue and 33rd

Street or something that was Jim Lindner's company.

That I came to learn later.

But they were making these robotic systems that would migrate videocassette tapes to

a digital format.

So think of a bunch of tape decks on top of each other with a gripper going up and down

and pulling videotapes out of a library, putting them in, waiting for them to be digitized,

taking them out, cleaning them-- not in that order, but essentially that way.

And I was just fascinated.

I mean, it was so cool.

Building robots.

Chris Lacinak:

Yeah.

Aaron Edell:

You know, video.

It was everything I loved kind of in one. And the rest is just really history from there.

Chris Lacinak:

Yeah.

So we have another intersection that I didn't know about, which was Hampshire College, although I was denied by Hampshire College.

So you definitely one-upped me on that.

Which I taught at NYU in the MIAP program, and Bill Brand also taught there, also taught

at Hampshire College.

And I told him that I was denied by Hampshire College.

And he said, I didn't know they denied people from Hampshire College.

Aaron Edell:

Oh, that makes it worse.

Chris Lacinak:

Anyway, all things happen for a reason.

It was all good. But that's very cool.

That is a great school.

And what a fascinating history there.

So it's not-- I mean, I still think there's-- let's connect the dots between working for

a company that was doing mass digitization of audiovisual and where you are today at

Wasabi.

Like, that is not necessarily easy to fill in that gap.

So tell us a little bit about how that happened.

Aaron Edell:

Yes.

Well, as my father likes to say, you know, life is simply a river. You just jump in and kind of flow down and you end up where you end up.

I don't think I could have engineered or controlled this.

p-- you know, SAMMA, this was:

If I could jump back, you know, and say to myself back then, this is where you're going

to end up, I would have just been like, how?

How do you do that?

How is that possible?

So this is what happened.

I mean, I-- you know, SAMMA was very quickly acquired by a company called Front Porch Digital

in:

Very close to:

And Front Porch Digital, you know, created these products that were-- the core product

was called DIVA Archive, which still exists today, although it's owned by Telestream.

But essentially, it is-- you know, you've got your LTO tape robot and you've got your

disk storage and you have-- you're a broadcaster.

And you need some system to keep track of where all of these files and digital assets

live and exist.

And you've got to build in rules.

Like, take it off spinning disk if it's old.

Make sure that there's always two or three LTO tape backups.

You know, transcode a copy for my man over here.

Automation wants some video clip for the news segment.

You know, pull it off tape and put it here.

All of that kind of stuff was the DIVA Archive software.

And I'm oversimplifying.

But through that process, you know, I was-- I joined as the-- I was kind of bottom of

the rung, like, support engineer.

And I had delivered some SAMMA systems, you know, installed some and did a little product

managing just because we were-- you know, we needed it.

We were only eight people.

And I was probably the most knowledgeable of the system other than one or two people

at the time.

And so by the time I got to Front Porch Digital, you know, I was doing demos and I was-- I

was architecting solutions for customers.

So I was promoted to a solutions architect.

And that's kind of where I learned, you know, business, just like generic business stuff,

emails, quotes.

I learned about the tech industry and media and entertainment industry in particular and

how, you know, how sales works in those industries and how it doesn't work sometimes.

And all of the products that are-- that are involved.

So I was kind of, you know, getting a real good crash course of just how media and entertainment

works from a tech perspective and how to be a vendor in the space.

I did a brief stint at New Line.

For those of you who don't know New Line, I don't think it exists anymore, but it was

a company based in Long Island that kind of pioneered some of the like set-top box digital

video fast channel stuff.

And then-- but I was more or less at Front Porch for about seven years.

And then Front Porch was acquired by Oracle.

And working at Oracle was a very different experience.

You know, they are a very, very large company and they have a lot of products.

And I don't know, I just-- it just didn't feel like I could do my scrappy startup thing,

which I had kind of spent the last 10 years honing.

So that is-- so, you know, that is kind of at the point where I-- a sales guy that I

had worked with at Front Porch named Tim Stockhaus went off to California to start this company

called GrayMeta based on this idea that we were all kind of floating around, which is,

man, metadata is a real problem in the industry right now, especially as it relates to archives

and finding things.

So GrayMeta was founded on that idea.

When I joined there, I was the first or second employee.

So it was-- we were building it from scratch.

And I mean building everything, not just the product or the technology, but the sales motions,

the go-to-market.

And that's where I learned all that stuff.

I quit GrayMeta about two years in to go start my own startup because I just wanted to do

it.

I wanted to be a founder.

I wanted to know what that was like.

And I, at that point, had learned a lot about machine learning and how it applies to the

media and entertainment industry, specifically around things like transcription and AI tags.

And a couple of my coworkers at GrayMeta had this really great idea that let's build our

own machine learning models and make them Docker containers that have their own API

built in and their own interface and just make them run anywhere, run on-prem, run in

the cloud, wherever you want.

Because it solved a lot of the problems at the time.

So we jumped ship.

We built the company.

It exploded.

I was the CEO.

My founders were the technical leaders.

And between the three of us, man, we were doing everything-- sales, marketing, building,

tech support, all of it.

And gosh, what a learning experience.

Also as a founder and CEO, you're raising money.

You've got to figure out how the IRS works.

You need to figure out how to incorporate stuff.

So a whole other learning experience for me.

a company called Veritone in:

Changed our lives.

I mean, we went through an acquisition.

We walked away with a lot of money.

And it was a whole new world.

Things open up, I guess, when that happens to you in business.

And I actually got recruited to join AWS.

And the funny thing is that it had nothing to do with media and entertainment or AI at

all.

AWS said, hey, you have a lot of experience taking situations where there's a lot of data

and simplifying it for people or building products to simplify it for people and make

it more consumable and understandable.

AWS has that problem with their cost and usage data.

Chris Lacinak:

Oh, interesting.

Aaron Edell:

Yeah. And you get a, especially if you use a lot of the cloud and you're a big company, you

get a bill. It's not really a bill.

You get like this data dump that's not human readable.

It's billions of lines long, has hundreds of columns.

You can't even open it in Excel.

It's like, how do I use this?

So AWS was like, go figure this out, man.

So I mean, gosh, it was such a great experience.

We built a whole business based on this idea.

We built a product.

We built a go-to-market function.

We changed how AWS and actually I think how the world consumes cloud spending.

I think we had that big of an impact, not to toot our own horn, but it was for me, for

my career and my learning as a human, wow.

Like seeing how you can impact the whole world.

Chris Lacinak:

Yeah.

Well, as a consumer of AWS web services, I'll say thanks because the billing definitely improved dramatically over the past several years.

So I know exactly what you mean.

And I see the manifestation of your work there.

I didn't realize though that that's what you were doing at AWS.

I did always have in my mind that it was on the AI front.

So that's really interesting that you kind of left that.

So in some ways, your role now is kind of a combination of the two in the sense that

Wasabi is a competitor to AWS, but you are very much in the AI space.

So tell us about what you're doing at Wasabi now.

Aaron Edell:

Yeah, absolutely.

Well, you know, it was your point is really spot on because one of the biggest problems, I think for customers of the cloud is that, and I learned this thoroughly, is that it's

not forecastable and it's really hard to actually figure out what you're spending money on.

And it also can be expensive if you do it wrong.

There really is a right way to do cloud and a wrong way.

And it's not always obvious how to navigate that.

So when I first came up, so, you know, the board of GrayMeta called me while I was at AWS,

you know, kind of chugging along and said, "Hey, Aaron, why don't you come and be CEO?"

And I thought, you know what, that's scary.

But it's also like, it's perfect because the GrayMeta story, I feel like we never got to

finish telling it.

I left, you know, before we got to finish telling it.

And so I came back and I said, "Guys, I've now had the experience of creating our own

machine learning models and running a machine learning company, like one that actually makes

AI and solves problems.

Let's do that."

So that's when I met Wasabi, was very shortly after I came back.

And you are totally right, because when I met Wasabi, it was like a door opening with

all this, you know, heavenly light coming through in terms of cloud FinOps.

Because Wasabi is, you know, cloud object storage, just like S3 or Microsoft Blob, that

is just $7.99 per terabyte and, sorry, $6.99 per terabyte and just totally predictable.

Like, you don't get charged for API fees, you don't get charged for egress, which is

where the kind of complexity comes in for other hyperscalers in terms of cost optimization

and understanding your cloud use and cloud spend.

That's all the unpredictable stuff.

That's what makes it not forecastable.

So the fact that, you know, Wasabi has just like a flat per terabyte per month pricing

and there's just nothing else.

It's just elegant and simple and beautiful and very compelling for the kind of experience

I had in the, and we call it the FinOps space or cloud FinOps space, where for three and

a half years, all I heard were problems about that this solved, right?

So it just pinged in my brain immediately.
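To make the forecasting point concrete, here is a minimal sketch of the two billing shapes being contrasted. Only the $6.99 per terabyte per month figure comes from the conversation. The metered model, its function name, and every rate passed to it are hypothetical placeholders, not any provider's actual price list.

```python
# Flat rate quoted in the conversation: dollars per terabyte per month.
FLAT_RATE_PER_TB = 6.99


def flat_monthly_cost(stored_tb, rate_per_tb=FLAT_RATE_PER_TB):
    """Flat pricing: the bill depends only on how much is stored."""
    return stored_tb * rate_per_tb


def metered_monthly_cost(stored_tb, egress_tb, api_requests,
                         storage_rate, egress_rate, rate_per_1000_requests):
    """Metered pricing: storage plus egress plus API request charges.

    All three rates are placeholders supplied by the caller (assumptions
    for illustration), not real price-list values.
    """
    return (stored_tb * storage_rate
            + egress_tb * egress_rate
            + api_requests / 1000 * rate_per_1000_requests)


if __name__ == "__main__":
    # Same 100 TB stored in both months; only usage changes.
    print("flat, any month:", flat_monthly_cost(100))
    print("metered, quiet month:",
          metered_monthly_cost(100, 1, 10_000, 20.0, 50.0, 0.005))
    print("metered, busy month:",
          metered_monthly_cost(100, 40, 5_000_000, 20.0, 50.0, 0.005))
```

With the flat model, next month's bill can be forecast from storage growth alone. With the metered model, the same stored volume can produce very different bills depending on how much is downloaded and how many requests are made, which is the unpredictability described above.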

The connection with AI, you know, goes back even further in the sense that I had always

advocated for, I always believed fundamentally that the metadata for an object and the object

itself should be as closely held together as possible.

Because when you start separating them and they're serviced by different vendors or whatever,

that's where the problems can seep in.

And one of the best analogies for this that I can think of is, you know, our Wasabi CEO,

Dave Friend, I love how he put it because he always refers to, you know, a library needs

a card catalog, right?

You go into the library and the card catalog is in the library.

You don't go across the street to a different building for the card catalog, right?

It's the same concept.

I mean, it's obviously, you know, the physical world versus the virtual computer world, but

similar concept in the sense that, you know, the metadata that describes your content,

it should be as close to the content as possible because if it's not, you know, you are at

risk of losing data at the end of the day.

I mean, I've talked to so many customers that have these massive libraries, sometimes they're

LTO libraries, sometimes there are other kinds of libraries where they've lost the database,

right?

And, you know, in LTO, like you need a database.

You need to know what objects are written on what tape.

It's gone.

I mean, what do you do, right?

You're in such a bad, it's such a bad spot to be in.

So hopefully we're addressing that.

Chris Lacinak:

Yeah.

So that's, I remember reading something on your website or maybe a spec sheet or something for AiR, which said object storage without a catalog is like the internet without a search

engine or something.

So, and to take that, to tie that to your other analogy, it's like a library without

a card catalog, right?

You walk in, you just have to start pulling books off the shelves and seeing what you

find.

Although there, we have a lot of text-based information.

When you pull a tape out of a box or a file off of a server, there's a lot more research

to do than there is maybe even with a book.

So yeah.

Aaron Edell:

Yes.

Chris Lacinak:

So tell me, what does AiR stand for?

It's a capital A lowercase i capital R. Tell us about that. What's that mean?

Aaron Edell:

So I believe it stands for AI recognition.

Chris Lacinak:

Okay.

Aaron Edell:

And so the idea is that, so the product Wasabi AiR is this new product and it's, you know,

the kind of combination of the acquisition. So I guess we skipped the important part, which is that Wasabi acquired the Curio product

and some of the people, including myself came over and the Curio product really was this

platform.

We called it a data platform, if you will, that when you pointed at video files and libraries

and archives, it literally, it would do the job of opening up each file, like you just

said and watch essentially watching it, you know, logging, you know, taking it, making

a transcript of all the speech, looking at OCR information.

So, you know, recognizing text on screen, recording that down, pull, you know, pulling

down faces, object recognition, basically creating a kind of rich metadata entry for

each file.

So this is where I think the, the, the kind of marriage between that technology and Wasabi

comes in because you're, we now have a way of essentially with Wasabi AiR it's, you know,

it's your standard object storage bucket.

Now you can just say anything that's in that bucket.

I want it, I want a metadata index for that.

We'll just do automatically with machine learning and you have access to that and you can search

and you can see the metadata along a timeline, which is really kind of turning out to be

quite unique.

I'm surprised that I don't see that at a lot of other places in specifically seeing the

metadata along the timeline.

And that's important because the whole point, it's not just search, it's not just, I want

to find assets where there's a guy wearing a green shirt with the Wasabi logo.

I want to know where in that asset those things appear because I'm an editor and I need to

jump to those moments quickly.
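For readers who want a concrete picture of "metadata along a timeline," here is a minimal sketch of the idea. It is an illustration only and does not reflect how Wasabi AiR stores or queries its index. The `Tag` record, the `moments` function, and the label strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    asset: str    # file name or object key
    label: str    # what was recognized, e.g. "green shirt"
    start: float  # seconds into the asset where the label first appears
    end: float    # seconds into the asset where it stops appearing


def moments(tags, *labels):
    """Return (asset, start, end) ranges where every requested label appears at once."""
    if not labels:
        return []
    wanted = set(labels)
    # Group the time ranges by asset, then by label.
    by_asset = {}
    for t in tags:
        if t.label in wanted:
            by_asset.setdefault(t.asset, {}).setdefault(t.label, []).append((t.start, t.end))
    results = []
    for asset, per_label in by_asset.items():
        if set(per_label) != wanted:
            continue  # this asset is missing at least one requested label
        # Intersect the ranges label by label, keeping only real overlaps.
        ranges = per_label[labels[0]]
        for label in labels[1:]:
            ranges = [(max(a, c), min(b, d))
                      for a, b in ranges
                      for c, d in per_label[label]
                      if max(a, c) < min(b, d)]
        results.extend((asset, s, e) for s, e in ranges)
    return sorted(results)


if __name__ == "__main__":
    index = [
        Tag("match.mp4", "green shirt", 10, 40),
        Tag("match.mp4", "Wasabi logo", 30, 50),
        Tag("promo.mp4", "green shirt", 0, 5),
    ]
    # Search answers "which assets"; the time ranges answer "where in the asset".
    for asset, start, end in moments(index, "green shirt", "Wasabi logo"):
        print(asset, start, end)
```

The point of keeping start and end times on every tag is that a query can return the moment to jump to, not just the file that contains it.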

Chris Lacinak:

Right, right, right.

Aaron Edell:

So that, that's, that's what we're doing at, at, at Wasabi with Wasabi AiR.

And that's, that's why AiR stands for recognition, AI recognition, because you know, we're essentially running AI against and recognizing objects, logos, faces, people, sounds for all your

assets.

Chris Lacinak:

So I want to dive into that, but before we do that, on the acquisition front, did Wasabi acquire

a product from GrayMeta or did Wasabi acquire GrayMeta?

Aaron Edell:

Wasabi acquired the product, the Curio product.

So GrayMeta still exists.

In fact, it's really quite, is thriving with the Iris product and the SAMMA product, which

we talked about SAMMA.

That was the other piece I skipped over that too.

When I, when they called me and said, come be CEO of GrayMeta, it really made sense because

SAMMA was part of that story.

And that, that was like a connection to my first job in tech, which was wonderful because

I love, I love the SAMMA product.

I mean, we were, we were preserving the world's history, you know, the, the National Archives

and Library, the Library, US Library of Congress, the Shoah Foundation, the, you know, criminal

tribunal in the Rwandan genocide from the UN, like just history.

So anyway, I digress.

Chris Lacinak:

Well, no, I mean, actually the last step, as we sit here today and talk, the last episode

that aired was with the video of Fortunoff, the Fortunoff Video Archive for Holocaust Testimonies, which was, I think one of the first, if not the first SAMMA users.

So that, that definitely ties in.

s that around, I think it was:

maybe:

I remember wandering around the NAB floor and, and for the past several months had been

having conversations with Indiana University about this concept of a project around, you

know, this, this, they, they had just digitized or actually were in the process of digitizing

hundreds of thousands of hours of content, video, film, audio.

And they had the problem that they had to figure out metadata for it.

You know, they had some metadata in some cases, in other cases, they didn't have any, in other

cases it wasn't dependable.

So we, we were working on a project that was how does Indiana University and others tackle

the challenge of the generation of massive amounts of metadata that is meaningful.

And so we, that, that was the spawning of this project, which became known as AMP.

And by the time this episode airs, we will have aired an episode about AMP, but I was

wandering around the NAB floor.

I come across GrayMeta.

As I remember, it was in, like, the back, up against the wall.

And I'm like, Oh my God, this is the thing we've been talking about.

Like it was kind of like this amazing realization that you know, other folks were doing great

work on that front as well.

I think at the same time there was maybe Perfect Memory.

I mean, they're, they're one of the ones who I see doing metadata on the timeline and in

a kind of a similar way that you're talking about, but but yeah, there weren't, there

weren't a lot of folks that were tackling that issue.

So it's really cool one to have seen the evolution.

Do I have that timeline right?

Was it about like:

Was that you had a product at that point?

I remember seeing it.

So like you had been working.

Aaron Edell:

Yeah, so I, so we, I joined, I was, like I said, the second employee at GrayMeta, which

would have been August of:

Right.

It must have been.

Chris Lacinak:

Yep.

Aaron Edell:

Yes.

So we, we did have a big booth and we had a product, but I can't remember exactly when it is we introduced machine learning for the tagging.

It's possible it was by then.

Yeah.

But it wasn't right away. Originally we were just scraping EXIF and header data from

files and, and sort of putting a, putting that in its own database, which yeah, it's

cool.

It's useful.

But when, when machine learning came out, holy cow, I mean just speech to text alone.

Yeah.

Think of the searchability.

Yeah.

s was definitely a problem in:

so for many years was that your only option was to use the machine learning as a service

capabilities from the hyperscalers and they were great, but they were very expensive.

Chris Lacinak:

Yeah.

Aaron Edell:

And talk about like cost optimization.

You know, we would even as testers, we would get bills from, from these cloud providers that, that shocked us after running it, running the machine learning.

So we, it's why we started Machine Box was because it just, we just didn't think it had

to be that, that, that that was the only way to do it.

And, and it was a problem.

Like we were having trouble getting customers because it was just too expensive.

That's all been solved now.

But, but that's why I think this is why it's interesting because the, the, it's really

good validation that you guys, that other people had come up with the same idea.

That to me is a great sign.

Whenever I see that when independently different organizations and different people kind of

come to the same conclusion that, yeah, this is a problem.

We can solve it this way.

But I think it's taken this long to do it in a way that's affordable, honestly, and

secure.

And also the accuracy has really improved since those early days.

Chris Lacinak:

Yeah.

Aaron Edell:

It's gotten to the point where it's like, actually this, I can use this.

This is a pretty, the transcripts in particular are sometimes 90 to 99 to a hundred percent accurate even with weird accents and in different languages and all sorts.

Chris Lacinak:

Yeah.

I agree. It's, it's, it's, it's come a long way to where it's, it's production ready in many

ways.

Let me ask you though, from a different angle, from the, from the customer angle, do you,

what are your thoughts on whether consumers are ready to put this level of sophistication

to use?

What do you see out there?

Do you see wide adoption?

Are you struggling with that?

What's that look like?

Aaron Edell:

So do you mean, you mean from the perspective of like, Hey, I've got a Dropbox account or something and I want to, I want to process it with AI?

Chris Lacinak:

Well, I think there's, I think about it in a few ways.

One is, are people prepared? And here let's think about logistics and technology.

They have their files in a given place.

They know what they know, what they want to do.

They can provide access, they can do all those things.

But the other is kind of policy wise, leveraging the outputs of, of, of something like Wasabi

AiR to be able to really put it to use in service of their mission and providing access,

preservation, whatever those goals are.

Do you, I guess I'm wondering about readiness on both those fronts.

Do you, do you see that as a challenge or do you find people are diving in whole hog

here?

What do you think?

Aaron Edell:

I think, I think people are diving in.

I think we've really reached the point now where I do think it's kind of, it's a combination of the accuracy and the sort of cost to do it.

Because if it's not very accurate and very expensive, that's a problem.

If it's very accurate and very expensive, it's still a problem.

But we're at a point now where we can do it inexpensively and accurately.

And so I'll mention that even just today, which, which, you know, by the time folks

listen to this, it'll probably be a few weeks in the past now or so.

But Fortune magazine published a post about Wasabi AiR and the Liverpool Football Club.

And they, what I, what I love is that they make it very clear, right?

Their use case, which is we want our, the fans of the football club to be able to go

onto an app and just watch highlights of, you know, Mohamed Salah crushing Man U, right?

Manchester United.

And just get it like a quick 30 second compilation of like all the goals or whatever, you know,

just just fan engagement.

And in order to accomplish that, you know, Liverpool has unbelievable amounts of video

content from every game from multiple cameras.

They're, you know, they're, I think people imagine that there's there's like a whole

bank of editors sitting around with nothing better to do.

It's not really true.

They don't, they don't have that many editors.

And these editors have to, you know, create content from all of this library and archive

constantly and basically Wasabi AiR makes them do it so much faster that they

can actually have an abundance of content ready for their app, which helps increase

fan engagement.

And it's that simple for them.

And they like the quote in the article from Drew Crisp, who is their senior vice president

of their digital world, says that that's how they think about applying AI.

You know, we want to solve this use case.

We want to be able to create this 30 second compilation of all these goals.

Maybe it's against a specific team or whatever the context is.

But we can't sit around for hours and hours and hours watching every single second and

maybe manually logging things or tagging things. You know, it

always happens after the fact, right?

You've recorded it all.

Okay, now it's safe, it's on a disk.

I've got all my footage.

And then maybe, you know, in the file name you put the team you played, but that's

not enough metadata.

So, yeah, so I think they are ready.

I think, you know, people have to think about it the right way.

You know, this is a productivity boost.

This is a time-saving boost.

This is a, what hidden gems do I have in my archive boost?

You know, that latter use case, by the way, is really spectacular, but

very hard to put a number to and hard to measure.

You know, how much money do I make from the hidden gems?

The things that I didn't even know I had in the first place.

Chris Lacinak:

And sports organizations are interesting.

They've always kind of been at the leading edge, I think when it comes to, um, creation and utilization of metadata in service of analytics, statistics, fan experience.

I mean, we think about Major League Baseball was always doing great stuff.

NBA has done some great stuff.

I mean, they have something going for them, which is a certain amount

of consistency, right?

There's a structure to the game, there are known names and entities

and things.

So, um, so that does make a lot of sense.

And it seems like it's just ripe for really making the most of something like

Wasabi AiR.

I can just see that being a huge benefit to organizations like that.

Can you give us some examples?

Are there other, maybe non-sports, organizations or use cases that are using Wasabi

AiR?

Aaron Edell:

Yeah, definitely.

Um, I'll give you one more sports one first, though, because, you know, the use case I gave you is about creating content and marketing content for channels

and for consumption by consumers.

But they also are, you know, especially individual teams, very brand heavy in the

sense that they seek sponsorship for logo placement in the field or the stadium

or whatever.

And AiR is used by sports teams to look at that data and basically roll up, hey, the

Wasabi logo appeared in 7% of this game and the Nike logo appeared in 4% of this game.

And then you can go to Nike and say, Hey, do you want to be 7%?

You should buy this logo stanchion or whatever.
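The roll-up described here can be sketched roughly as follows. This is a minimal illustration, not Wasabi AiR's implementation: the detection format (logo name plus start and end seconds) and the function name `logo_screen_share` are assumptions made for the example.

```python
from collections import defaultdict

def logo_screen_share(detections, duration_s):
    """Return the fraction of the runtime during which each logo is visible.

    detections: iterable of (logo, start_s, end_s) tuples (hypothetical format).
    Overlapping spans for the same logo are merged so time is not double counted.
    """
    by_logo = defaultdict(list)
    for logo, start, end in detections:
        by_logo[logo].append((start, end))

    shares = {}
    for logo, spans in by_logo.items():
        spans.sort()
        total = 0.0
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                # Span overlaps the current one: extend it
                cur_end = max(cur_end, end)
            else:
                # Gap found: close the current span and start a new one
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
        shares[logo] = total / duration_s
    return shares
```

Fed a 90 minute game (5,400 seconds) in which one logo is on screen for 378 seconds and another for 216, this returns shares of 0.07 and 0.04, which is the kind of number a team could take to a sponsor.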

So really interesting use cases there, but non-sports use cases.

So one of my all-time favorites is a company called Video Fashion, and Video

Fashion has a very large library.

I think it's to the tune of 30,000 hours of video footage of the fashion industry

going back as long as video can go back.

And a lot of this was on videotape and needed to be digitized.

And I think they still have a lot that still needs to be digitized, but they used Wasabi

AiR back when it was called Curio, um, basically to kind of, you know, auto tag and catalog

these things for when they get a request. And they license this footage, right?

So this is how they make money.

This is how they monetize it.

This is why I like this use case because it's a very clear cut monetization use case where

they license this footage per, I want to say, per second probably.

And so Apple TV Plus came to them one day, just as an example, and said, Hey,

we're making a documentary.

It's called Supermodels.

Do you have any footage of Naomi Campbell in the nineties?

It took them like five seconds, right?

To bust out every single piece of content they have where not only does Naomi Campbell

appear, but her name is written across the screen.

Somebody talks about her, right?

So it's literally like a couple seconds.

Yeah.

And then they just, they license it, right?

So they get all this revenue and have very little cost associated with servicing

that revenue.

And that's exactly the kind of thing we want Wasabi AiR to empower.

You know, time is money, my friend.

Yeah.

We're saving time.

Chris Lacinak:

One of the things I really like about Wasabi AiR is that it allows you to do sophisticated

search where you can say, I want to see Naomi Campbell. I want it in this geographic location.

I want it at this facility and wearing this color of clothing or something, right?

Like you can put together these really sophisticated searches and come up with the results that

match that, which I think is just fantastic.

I think that is, that is the realization of what the ideal vision is for being able to

search through audio visual content in the same way that we search through Word documents

and PDFs today.

I mean, that's fantastic.

Let's make this a little bit more concrete for people.

We haven't really talked about exactly what it is.

We've got this high level description.

But let's jump in a little bit more.

So folks that are going to use Wasabi AiR would be clients that store their assets

in Wasabi storage.

Is that a true statement?

Aaron Edell:

Yes, they can be existing customers or, you know, new customers. But yes, you need to put your stuff in Wasabi storage.

Chris Lacinak:

You've got your assets in Wasabi storage.

How do you turn Wasabi AiR on? Is it something that's in the admin panel?

How does that work?

Aaron Edell:

Not yet.

I mean, that's what we're working towards. Right now, you reach out to us, you know, reach out to your sales representative or,

you know, honestly, on our website, I think we've got a submission form, you say, I'm

interested, this is how much content I have.

And you don't have to be a Wasabi customer when you reach out, right?

Like, we'll help you sort that out.

But essentially, we'll create an instance for you of Wasabi

AiR.

And when we do that, we'll attach your buckets from your Wasabi account, and it'll start

processing and basically, you'll get an email or, you know, probably an email with a URL

and credentials to log in.

And when you click on that URL and log in, you'll have a user interface that looks a

lot like Google, right?

There's, you know, some buttons and things on the side, but essentially, right

in the center is just a search bar.

And we want it to be intuitive, of course, obviously happy to answer questions from folks,

but you should be able to just start searching. You know, we'll be processing in the background

and maybe you want to wait for it to complete processing, it's up to you, but you can just

start searching, and you'll get results.

And those results will sort of tell you, you know, some basic metadata about each

one, there'll be a little thumbnail.

And then let's say you search for the word Wasabi.

And maybe you specified just logos.

I just want where the logo is Wasabi, not the word or somebody saying Wasabi.

When you get the search results, let's say you click on the first one, you'll have a

little preview window and you can play the asset if it's a video or audio file, right?

We have a nice little, you know, proxy in the browser.

And then you're going to see all this metadata that's all timeline, timecode accurate along

the side.

And you can kind of toggle between looking at the speech to text or looking at the object

tags, and then on the bottom will be a timeline, kind of like a nonlinear editor. There'll be this long

timeline, and for your search term, Wasabi for the logo, you'll see all these little, like, kind

of tick marks where it found that logo.

So you can just click a button and jump right to that moment.

And what I like about that is, let's say, in the use case, you're trying to

quickly scan through some titles for bad words, or for nudity or violence

or something like that.

Those, you know, those things will show up and in five seconds you can

just, you know, go through them and make sure they're either okay or not, right?

Like sometimes, for example, it'll, you know, it'll give you a false positive.

That's just what happens with machine learning.

But it doesn't take you very long.

In fact, it takes you almost no time at all to just clear it and just, you know, go through

and then if you want, you can even edit it and just remove it or add a tag or something.

So hopefully that gives a good picture.

Chris Lacinak:

Yeah, so well, and I'll ask this question, because Wasabi is so transparent about pricing.

You've mentioned $6.99 per terabyte. Is there transparency on that level yet with AiR?

Or is this still something that's in motion?

Or?

Aaron Edell:

Yeah, we're still working on it.

But we did come out with pricing for NAB, we're calling it the NAB show special.

So you know, get it while it's hot, I guess, because we probably will have to change it.

But it's just $12.99 a terabyte per month.

So think of it almost like a different tier of storage, although, you know, it's

the same storage, it's just that you now have all this indexed metadata.

Chris Lacinak:

And is that $12.99 per month on top of the $6.99 per month? Or is that inclusive, so $12.99 total?

Aaron Edell:

Exactly.

Yeah, which is still cheaper than I think the 20 or 30 bucks per terabyte per month for just the storage for some of the hyperscalers.

So you know, even if you didn't use AiR, and you were just paying for the storage,

it's still a lot less expensive.

And there's no egress and no API fees and all that.
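The per-terabyte figures quoted in this exchange are easy to turn into a monthly comparison. The sketch below uses only the numbers mentioned in the conversation; the $20 hyperscaler figure is the low end of Aaron's rough estimate, and the 100 TB library size in the usage note is an arbitrary example, not a real customer figure.

```python
# USD per terabyte per month, as quoted in the conversation
STORAGE_ONLY = 6.99      # Wasabi storage alone
WITH_AIR = 12.99         # NAB show special, inclusive of storage
HYPERSCALER_LOW = 20.00  # low end of the "20 or 30 bucks" estimate, storage alone

def monthly_cost(terabytes, price_per_tb):
    """Monthly bill in USD for a library of the given size."""
    return round(terabytes * price_per_tb, 2)

def air_premium_per_tb():
    """What the indexed metadata adds on top of plain storage, per TB per month."""
    return round(WITH_AIR - STORAGE_ONLY, 2)
```

For a hypothetical 100 TB library, `monthly_cost(100, WITH_AIR)` comes to 1299.0, against 699.0 for storage alone and 2000.0 at the low hyperscaler estimate.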

Chris Lacinak:

Yeah.

So I mentioned the project that I was working on before called AMP. We came up with the term MGM, which stands for metadata generation mechanisms.

And this is to say speech to text or object recognition or facial recognition would

all be things we called MGMs, right?

Do you have a term for those?

What do you call those so I can refer to them the way you do?

Aaron Edell:

So this is so funny you ask, because when we started GrayMeta, we had so

much fun trying to come up with that term.

And the original product was called Haystack.

Because we thought you're going to find the needles and I like that.

Right?

Chris Lacinak:

I like that.

Aaron Edell:

Yes.

So how do you find a needle in a haystack? Bring a big old magnet.

So we called those things magnets at first.

You'd have a magnet for speech to text or whatever.

I think they were still called magnets by the time I left.

When I came back, we were calling them harvesters, or maybe, gosh, extractors, maybe extractors.

But since we joined Wasabi, I think we've just been referring to them as

models, honestly. Not all of them are machine learning models, but you know.

Chris Lacinak:

Okay, well, just so we have a term for this discussion.

And I'll use the term models then to talk about that. So can you tell us what models you have built into AiR right now?

Aaron Edell:

Yes.

So right now, we have speech to text, which is outstanding and understands, I think, 50 languages and will even translate it to English, as well as do a transcription in the native

language.

We have an audio classification engine, which, you know, basically tries to tell you what

sounds it hears, you know, coughing, screaming, gunshot, blah, blah, blah.

We have a logo and brand detection system, which we just trained ourselves from scratch

and is very good, actually. I'm really surprised, because when we were

doing this before, it was a really hard problem to solve.

It still is, but we actually got it working.

Then we have an object recognition model, which will essentially try to tag things that it

sees: lamp post, shirt, beard, that kind of thing.

And then we've got OCR, optical character recognition.

So words that appear on the screen get turned into metadata.

And then we've got what we call technical cues.

So this is very specific to the M&E industry, but bars and tone, slate, titles, that sort

of thing.

And then faces and people.

So, you know, we will detect faces and then, kind of like how in iPhoto

on your phone, it'll say, who's this?

Right.

Here's a bunch of photos of this person.

Who is this?

Same thing.

Right.

We group unknown faces together.

You can type in who they are.

And then going forward, you basically have names associated with faces.

Right.

So it's a very, very simple system.

Chris Lacinak:

And if I remember right from the demo, you can also upload images of individuals that

you know are going to be in your collection and identify them proactively. Right.

Like for myself, I could upload three photos of myself, say this is Chris

Lacinak and then it'll use that.

You can do it ahead of time.

Aaron Edell:

Exactly.

Yeah. Yes.

So if you know the people ahead of time, you can do it, too, which is really useful.

Chris Lacinak:

I like that feature.

I mean, that's another thing that is similar with AMP, just the concept of using non-audiovisual materials in order to train models to describe audiovisual objects.

OK, that's great.

And are all of those models things that Wasabi has built that

are owned by Wasabi?

Or are you connecting to other providers of AI services?

Aaron Edell:

We built all our own models, all homegrown.

This was my big change when I came back to GrayMeta because I had experience doing it.

I knew it was possible and I didn't think that relying on third party models was a good

idea.

I mean, obviously, for intellectual property reasons, but also it's just really expensive

to do it that way.

We wanted to make it just basically, I don't want to say cheap, but we

wanted to make it economical for people.

Right.

Because that was a major barrier.

If you are, you know, a large library, you could have millions of hours of footage.

And if you're paying the hyperscalers, which charge like 50 bucks an hour in some cases,

I mean, are you going to spend 120 million dollars on

AI tagging? Probably not.

So we built all our own.
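The arithmetic behind that figure is worth spelling out. At the quoted rate of about 50 dollars per hour of footage, 120 million dollars corresponds to roughly 2.4 million hours. That library size is inferred here from the two numbers in the conversation, not something stated directly, and the function names are just for illustration.

```python
def tagging_cost(hours_of_footage, rate_per_hour):
    """Total cost in USD to run tagging over a library at a per-hour rate."""
    return hours_of_footage * rate_per_hour

def hours_for_budget(budget, rate_per_hour):
    """How many hours of footage a given budget covers at a per-hour rate."""
    return budget / rate_per_hour
```

So `hours_for_budget(120_000_000, 50)` gives 2,400,000 hours, which fits the "millions of hours" a large library might hold.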

And the reason we were able to do that, and by the way, like, don't think you can just

go on to Hugging Face and pull down a model off the shelf and just pop it into production.

I have seen that you can't do that.

And the reason is because, you know, a lot of those models are trained on not media and

entertainment.

They're trained on other world things and they don't work.

Their accuracy drops when you're talking to people like L of C or you're talking

to, you know, pick your broadcaster, pick your network,

pick your post house.

When you're talking about media and entertainment content, they need to be trained for that.

And then you got to build in pipelines and we had to do all kinds of stuff to make it

efficient, because there's a lot of really cool machine learning out there that's very

advanced but very expensive and compute intensive to run.

And that's also not going to work for customers.

They can't spend 50 bucks an hour on their machine learning tagging.

It's a no go.

So we've put years of experience into our models and also understanding

like what to expect on the other end.

There's a guy who works for me, Jesse Graham.

He's been doing this for so long that you can give him any machine learning model now.

And he knows the pieces of content that are going to throw it for a

loop and he can see the results and he knows customers are going to either be OK with this

or not.

Chris Lacinak:

Yeah.

Aaron Edell:

And that experience is so valuable to us because it lets us quickly iterate.

It lets us go to market with production models that actually work for customers. They're not just cool demos.

You know, they're not just kind of fluffy fun things.

They're real.

They have real value.

And that's why we spend so much time building our own models.

Chris Lacinak:

Do you have feedback or requests for the DAM Right podcast?

Hit me up and let me know at damright@weareavp.com. Looking for amazing and free resources to help you on your DAM journey?

Let the best DAM consultants in the business help you out

at weareavp.com/free-resources.

And finally, stay up to date with the latest and greatest from me and the DAM Right podcast

by following me on LinkedIn at linkedin.com/in/clacinak.

And let me ask related to that, talking about training it based on media and entertainment

broadcast content.

How have you found it to work, or have you done testing on archival content, stuff

that's not necessarily production broadcast quality, always highly variable,

maybe lower quality audio and video? How is it performing on that

sort of content?

Aaron Edell:

Surprisingly well, actually, I'll give you an example.

So Steamboat Willie, which is now in the public domain, you know, practically an ancient piece of animated content featuring, I think, the original appearance of Mickey Mouse, although

I don't think he was called Mickey Mouse back then.

Anyway, it correctly identifies that the boat is a boat.

You know, the object recognition, surprisingly, is able to tag things that are animated and

in black and white.

I have also seen it pick up logos that are almost undetectable by human eyes.

So we had so much fun showing this off at NAB because Wasabi sponsored the Fenway

Bowl recently.

And so we had the Fenway Bowl.

We ran it against Wasabi AiR.

And there's obviously a ton of logos everywhere.

And so there was this one logo, Gulf, I think it was Gulf Oil or something like that.

And I would show it.

So I'd pull it up on the screen and I would click and jump to that moment.

And I would say, OK, everybody who's watching me do this demo right now, tell me when you

see the Gulf Oil logo in the video.

And they're like squinting and, you know, most people don't see it.

But if you kind of expand it and zoom in, it's just there, teeny tiny little thing in

the background.

So, yeah, I've been really pleased with

how far machine learning has come in terms of the research that's gone on behind it.

You know, the embeddings and weights that people are open sourcing and

making available.

It's just extraordinary.

Chris Lacinak:

Yeah.

Let me dive into the weeds a little bit here about the models and things. One of the things that we developed in AMP, and I know

that you had to have thought about this and I'm curious where you've arrived and what

you're thinking about for the future,

is the concept of workflows.

It sounds like, and correct me if I'm wrong once I'm done saying this,

I have my videos and my audio and things stored in Wasabi.

I turn on Wasabi AiR and it runs these models.

It sounded like seven or eight-ish models, I think, in parallel.

But let's say that I wanted to create something that does speech to text and then runs it

through named entity recognition, sentiment analysis.

Right.

I want to take outputs of one model, plug them into another model and create workflows

instead of just getting the output of a single model.

Where are you at?

Does that exist today?

Is that on the horizon?

What's that like?

Aaron Edell:

Yeah.

So I've experimented with that in some way or another at actually several different companies. In fact, I think at Veritone, we even had like a workflow builder that you could do

where you could sort of drag nodes in and go output from this to there.

I think the way that we're thinking about it today is we just don't want you

to even have to do that.

So let's pick apart why you're doing that.

So named entity recognition based on speech to text.

It's a really good example.

Like I want maybe I want to search by places.

So speech to text, particularly the one that we've developed, is surprisingly good,

is shockingly good at proper nouns and proper names for things.

This is where speech to text in the past has always fallen down.

But it's just text.

So the way we think about it is instead of you having to come up with that use case for

that workflow, we're just going to build that in.

So when you're running product and you're thinking about, "Okay, how do I solve these

problems?"

I like to, and this is a great thing I learned from working at Amazon, just

put yourself in the customer's shoes, be customer obsessed.

Think about, okay, the editor is sitting down, they got to do their job.

They want to get shots of the Eiffel Tower or something or maybe just I don't know, I'm

trying to think of a better example of that.

Because if you search for Eiffel Tower, it'll just show up.

But named entity recognition is like companies or something like that.

Maybe I'm looking for people.

Okay, I got it.

When people are talking about Wasabi the company and not wasabi the sushi sauce, right?

I want to differentiate.

So normally, if I search for the word Wasabi, obviously, all references will show up.

We are going to give you an experience where that is just seamless, right?

It's a new option.

Just search for Wasabi the company or I'm doing named entity recognition on the speech

to text.

That's how we might solve it in the back end.

We may solve it some other way.

The whole machine learning pipeline thing is what's really evolving.

Like for example, our audio classification and speech to text are one multimodal model,

for example.

So there's this kind of newer world of these end to end neural networks that are really

good at doing different things.

Instead of the old way, which is what you described, where we would kind of have the

output of one and go and make it be input in another, that kind of ends up being like

a Xerox of a Xerox of a Xerox sometimes.

So we're building kind of more capabilities around combining these things into one neural

network so that A, it's way more efficient and B, it's more accurate.

So you're going to see from us in the coming months a lot of innovations around

that and with the express goal of doing what you described, which is just better search,

better, more contextual, more accurate search.

Chris Lacinak:

Well, I have to hand it to you.

I mean, I think what you gain with sophistication, you kind of add a burden of complexity. And right now I've seen the demo of Wasabi AiR and it is elegant in its simplicity.

I can totally understand aiming for simplicity.

That's going to be a better user experience.

So yeah, that's interesting.

It'd be good to maybe, I don't want to bore our listeners with that, maybe a sidebar sometime

offline we can talk about that.

And another question in the weeds here, I mean, one of the things that I've grappled

with or we grappled with in the AMP project that I'd love to know what you're thinking

about or how you're managing this, if you're able to share, is on the front of

processing efficiency, right?

The concept of running, for instance, speech to text where there's nobody talking, it's

music or BPM analysis on music where there's somebody talking, right?

Facial recognition where there aren't people.

You get the idea here, but trying to really only feed relevant segments to models,

using your term, in order to create more efficient and cost-effective processing.

Is that so negligible?

Is that so processor intensive that it doesn't really pay off, or is that an actual model,

and I'm now using model in a different way, that works?

Aaron Edell:

I know what you mean.

Yeah, I think it does add up. So in the true FinOps cost optimization fashion, once you take out the big things, you go after

the little things because they just add up, right?

If I can reduce some fee that's one cent or something to half a cent, that in theory would

add up.

So it's worth it to think about it.

We do have some of that.

So for example, you mentioned a really good example of that, which is don't run speech

to text if there's nobody talking.

So we actually have a separate model that we call, I think it's called the voice activity

detector or something like that.

So this is what I mean.

It's such a good example of what I was trying to convey, which is that you have to think

about these things when you're doing this in production.

And these are the things that drive efficiency to make it actually viable for customers to

pay for and use.

So when we first started building our own speech to text, we just plopped it in, we

ran it and my goodness, it was so slow.

And the accuracy was great, but it just was not going to work.

Over time, we built the pipeline better.

We introduced VAD that greatly improved the accuracy of the timecode markers for the speech

to text, as well as improved the overall efficiency.

I mean, I don't want to get in trouble for this, but I think we improved the efficiency

by a hundred times.

Think about that.
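The gating idea described in this answer, detect where there is voice activity first and only send those spans to speech to text, can be sketched as follows. This is a minimal illustration under stated assumptions, not Wasabi's pipeline: a crude RMS energy threshold stands in for a real voice activity detector (production VADs are usually trained models), and `transcribe` is a hypothetical callback representing whatever speech to text engine is in use.

```python
import math

def speech_spans(samples, sample_rate, frame_ms=30, threshold=0.02):
    """Return (start_index, end_index) sample spans whose RMS energy is above threshold.

    Energy thresholding is only a stand-in for a real voice activity detector.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    spans, start = [], None
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        if rms >= threshold and start is None:
            start = i * frame_len                  # activity begins
        elif rms < threshold and start is not None:
            spans.append((start, i * frame_len))   # activity ends
            start = None
    if start is not None:
        spans.append((start, n_frames * frame_len))
    return spans

def transcribe_gated(samples, sample_rate, transcribe):
    """Run the (hypothetical) transcribe callback only on active spans.

    Returns (start_seconds, end_seconds, text) tuples, so each piece of text
    carries the timecode of the span it came from.
    """
    results = []
    for start, end in speech_spans(samples, sample_rate):
        text = transcribe(samples[start:end])
        results.append((start / sample_rate, end / sample_rate, text))
    return results
```

Two things fall out of this structure: silent or music-only stretches never reach the expensive model, and each transcript segment comes back anchored to the start of its span, which is consistent with the timecode improvement mentioned above.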

Chris Lacinak:

Yeah.

Aaron Edell:

That's a huge, huge difference.

And that's just basically trial by fire in some ways. I mean, I believe in iterative product design.

I don't like to sit around for six months and try to build the perfect product.

I like to build little things and iterate quickly and learn.

And that was one of the first things that we learned when we first started doing speech

to text.

And we just iterated it and made it faster, faster, faster until we got to this super

efficient state.

So yeah, for an in the weeds question, that was a really poignant one because it is where

I think the value of AiR comes from and perhaps other systems that are trying to accomplish

the same thing is when you build your own machine learning, there's a lot of things

you got to think about and it's hard to know what they're going to be up front.

And it's taken us years to get it right.

Now, it doesn't necessarily mean it'll take everybody years.

You can always learn, but it's a trial by fire.

Chris Lacinak:

It makes me think of Formula One racing with hundreds of little tweaks to these vehicles

to make things just get a 10th of a second faster or something. Right.

Let me jump over to the questions around ethics and AI.

And I'm going to break that into a couple categories to kind of go off of here.

I guess, you know, when it's come up, typically there's one around bias, how do the AI models

in this way perform across a variety of contexts?

Another is around intellectual property.

Like here we think of ChatGPT now, where I can buy the business license in which my content

that I'm feeding it is not going to train the model, right?

As opposed to the free or the cheap one where my data that I feed it goes to train the larger

model.

Can you talk about how you are thinking about and acting on those sorts of ethical questions

today?

Aaron Edell:

Absolutely.

You know, for me, machine learning is a means to an end. So I kind of like to use the analogy that, you know, I don't go around talking about

how Wasabi AiR is built on electricity, right?

Like that doesn't make sense.

Electricity is a technology that we kind of take for granted.

Machine learning solves the real problem that I'm trying to solve, which is I don't want

people to have to lose content in their archives.

I want people to be able to find stuff quickly and be able to get it out the door.

And I want editors to just have a wonderful life, not be miserable.

And so I think about machine learning in that sense.

I don't think about it as a, hey, we're going to try and scrape as much data and make the

best overall models and make money by selling machine learning, if that makes sense.

So I think your motivations for your ethical use of AI start there.

The bias thing is really interesting, and I have to hand it to, I mentioned my Machine

Box co-founders before, Mat Ryer and David Hernandez.

David Hernandez, brilliant computer scientist, he really taught me a lot.

And one of the things that he pointed out to me, and this was back when we were doing

Machine Box, was about the systems that

turn words into vectors.

And this is important because for the listeners who don't know what that means, basically

take the word frog and take the word toad.

Now instinctually as humans, we know that those are a lot closer together in concept

than the word frog and the word curiosity.

So vectors attempt to kind of do the same thing.

We take basically every word, and this is, you have to picture a thousand dimensions,

right?

It's not a three-dimensional thing, it's like thousands of dimensions.

But basically in these thousands of dimensions, we can do math to figure out that the word

frog and the word toad are very close together.

And this helps us in search.

So if I search for toad, I get pictures of frogs because they're very relevant.
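The "math" in question is typically a similarity measure between vectors, with cosine similarity being a common choice. The sketch below uses three-dimensional toy vectors invented purely for illustration; real embeddings have hundreds or thousands of dimensions and are learned from data, and nothing here reflects the embeddings Wasabi AiR actually uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, made up for this example
vectors = {
    "frog": [0.9, 0.8, 0.1],
    "toad": [0.85, 0.75, 0.2],
    "curiosity": [0.1, 0.2, 0.95],
}

def nearest(query, vectors):
    """Rank every other word by similarity to the query word, closest first."""
    q = vectors[query]
    others = [w for w in vectors if w != query]
    return sorted(others, key=lambda w: cosine_similarity(q, vectors[w]), reverse=True)
```

With these toy values, `nearest("toad", vectors)` ranks "frog" ahead of "curiosity", which is how a search for toad can surface content tagged frog.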

Now those systems, a lot of these embedding and vectorization systems were trained, at

least back in the day, and I'm pretty sure this has been addressed, but they were trained

on news articles and written material from humanity ranging all the way back.

So what happened was that if you actually look at the distance between, for example,

the word doctor and the word man, much closer than doctor and woman, and the inverse was

true for nurse and man and nurse and woman.

Now that's a bias.

That bias came from the training data, which again, I think, was a lot of news articles

written over the last 70 years or something like that.

So what you end up with is a machine learning system that's just as biased as humans are

or have been in the past.

And they don't necessarily reflect our inclusive nature and how we want our society to exist

where we don't want that bias.

That's not something we want in our machine learning because we're using our machine learning

to solve problems in the real world and it doesn't reflect the real world.

So I think about that a lot and I think about how can we improve our machine learning.

Now it's the training data.

It's not the machine learning models.

It's the training data.

So we as humans have to go back and fix that in the training data and do our best to think

of those things ahead of time.

And there's ways, there's tools to process your training data in certain ways and look

at patterns and things like that.

And you can detect that kind of thing.
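One simple check of that kind is to compare how close an occupation word sits to each of two gendered words in the embedding space. The sketch below is a hedged illustration: the vectors are toy values constructed to show the skew described earlier, not measurements from any real model, and `gender_skew` is a name made up for this example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def gender_skew(word, vectors, a="man", b="woman"):
    """Positive means `word` sits closer to `a`; negative means closer to `b`."""
    return cosine(vectors[word], vectors[a]) - cosine(vectors[word], vectors[b])

# Toy vectors, deliberately skewed to mimic biased training data
toy = {
    "man": [1.0, 0.0, 0.2],
    "woman": [0.0, 1.0, 0.2],
    "doctor": [0.7, 0.3, 0.6],
    "nurse": [0.3, 0.7, 0.6],
}
```

Here `gender_skew("doctor", toy)` is positive and `gender_skew("nurse", toy)` is negative. In an unbiased space both would sit near zero, so scanning a list of occupation words for large skews is one way to flag where the training data needs attention.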

So I'm always thinking about that and I always want to make it better.

And it's probably an ongoing challenge that's never going to really end, but something that

we have to pay attention to.

Ethically, like any technology, any new technology, what I'm about to say could be applied to

nuclear physics.

It could be applied to electricity.

It could be applied to taking metal and making it sharper.

Don't use it for bad things.

Your intentions, like I mentioned before, my intention is to make people's lives at

their jobs, in particular media and entertainment editors and marketing people and these professionals,

I don't want them to have to sit around trying to find stuff.

I want to make them immediately find the thing they're looking for and deliver the content

and the value that they want.

That's my purpose.

If your purpose is to go around electrocuting people or dropping nuclear bombs or stabbing

people, you're going to use these technologies in the wrong way.

So I don't mean to say that we all have to just be responsible for our own actions.

I think we do, but the rules that we come up with, scientists have rules around bioengineering,

for example.

There's laws against you can't patent certain molecules, you can't patent DNA.

Those things are being challenged all the time.

But I do think that we can collectively as a society agree that we're not going to use

AI for these purposes, even though some people will.

You can't legislate bad guys out of existence.

They will be there and they will test it.

But I think the more educated we are about it, the more we can tackle it.

But I don't think that means we have to stop using AI or ML or we can't innovate and we

shouldn't innovate and we shouldn't see where this can go.

I think that's equally as dangerous.

Chris Lacinak:

I've got a question that's a little bit out there, but if you don't have a response to

this, I don't know anybody who does. So you're the best person I can think of to ask this question.

And that is about the prospect of a future in which the machine learning models, and

here I'm not talking about models as in things that generate metadata, but the machine learning

model as in the thing that you train over time, are interoperable.

Is there a future in which I go to Wasabi as an organization, my data is there, I

spend years training it and cultivating that data, not just the data, not the output of

just the metadata, but let's say the machine learning that we do over time and training

the models and giving it feedback and maybe triangulation of that data, that God forbid

Wasabi goes out of business in 20 years, that I could take that and transfer it to another

entity that has machine learning.

Is there a future in which such a thing exists or is that not even on the horizon?

Aaron Edell:

Well, today, I'm a very customer obsessed person.

I mentioned that already. And I think if I'm the customer, when I spend effort and time training a machine learning

model, let's say in Wasabi AiR, which you can do, you can train it on people and soon

you'll be able to train it on other things.

I'm putting my effort and my data into that.

I should own that.

And I believe in that.

So we segment that all off.

We don't aggregate people's data.

We don't look at the training data and make our own models better.

You own it.

It's your data.

If you trained it, it's yours.

But I think that it would be hard, just the nature of the technology itself, it's hard

to take all that training and shuffle it off somewhere else.

I mean, I guess in theory, there's like embeddings and vectors and stuff like that and you could.

I think more likely over time, you won't have to train it.

I think our models will get better at context.

They will be larger.

They'll have more parameters.

But I also think that they'll get more specific and I kind of like this agent approach that's

kind of emerging where, let me put it this way.

I do not think that artificial general intelligence is anywhere near happening.

I mean, I think people will change their definition of what that means to kind of fit their predictions.

But I don't think that we're in danger of one very large AI model that just does everything

and takes over humanity and kills us all.

Or I don't know, who knows, maybe they'll do something wonderful, like help us explore

other planets, whatever.

I think what's more likely is that we will get better at segmenting off specific tasks

and making machine learning models that are just very, very, very good at that and then

orchestrating that, which is kind of what Wasabi AiR does today.

But I don't think the need for training it is interesting because if you asked me this

question back in:

machine learning, which is that your machine learning model should be trained on the data

that it's expected to run against.

You should not be able to tell the difference.

And this was kind of at the time when synthetic training data was emerging and you can't beat

a human curated, really, really clean, really good data set.

You can't beat it.

And today I think that that might be changing a little bit and that the need to train models

to be more specific or to train it on your own data is not heading up.

I think it's probably going down.

In fact, we already see some of it a little bit.

Like, you know, take, okay, great example, the Steamboat Willie example.

It used to be that you would have to train your object recognition system to recognize

animated objects as kind of custom objects.

We have been experimenting with some machine learning that we haven't put into AiR yet,

but we might at some point where you don't have to do that anymore.

In fact, it actually interprets your search in a different way.

So if I searched for, let me put it this way, like it would process a picture.

Let's say it takes a picture of the two of us talking and I have a beard and you don't

have a beard.

And I sent it to this system and processed it.

Instead of coming back with brown hair, beard, blue shirt, microphone, right, this whole

list of things, it just sits there.

Then you ask it, is there a microphone in this picture?

Yes, there is.

Here it is.

Is there, and this is what I like about it because the words that we use can be very

different.

So is there a mustache?

Yes, there's a mustache.

And it draws a line just around this part of my beard.

Instead of saying the whole thing is a beard, right?

Or it's using an LLM to interpret the question rather than trying to seek custom training.

And it has a fundamental deep understanding of the picture in a way that we don't understand

as humans, right?

It's broken it down into vectors and things that are just basically math.

And when you ask it, is there a green shirt here?

It interprets your question and goes, okay, this vector over here kind of looks like a

green shirt.

I'm going to say there's a 60% chance that that's what it is and draw a bounding box

around it and there you go.
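As a rough sketch of that idea, assuming the image has already been broken into regions that each carry a vector, and that the question has been turned into a vector in the same space. The Region type, find_concept, and the 0.5 threshold are all hypothetical choices for illustration; this is not how Wasabi AiR or any particular model is implemented.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Region:
    box: Tuple[int, int, int, int]  # bounding box as (x, y, width, height)
    vector: List[float]             # assumed to come from some image model


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_concept(query_vector, regions, threshold=0.5):
    """Return (box, score) for the region closest to the query, or None."""
    if not regions:
        return None
    best = max(regions, key=lambda r: cosine(query_vector, r.vector))
    score = cosine(query_vector, best.vector)
    return (best.box, score) if score >= threshold else None
```

The point of the design is that nothing is labeled up front: the same stored vectors can answer "mustache", "beard", or "green shirt" depending on how the question is interpreted.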

I think that's the future.

I think that's where we're going.

Machine learning models that are specific, but way more contextual and understand images

and video and data in ways that we can't, but can be mapped to concepts that we as humans

think about.

Chris Lacinak:

And somewhat related, kind of pulling several of these strings together, like the question

around humans in the loop, like we've done a lot of work with the Library of Congress and Indiana University, and that AMP Project kind of had humans in the loop at its core

as far as these workflows go.

And some of that was quantitative.

It was about, for instance, taking the output in a given workflow, taking the output of

speech to text, reviewing it by a human, editing, correcting, and then feeding it back sort

of thing.

Some of it's qualitative.

It's about ethics.

There are some sensitive collections that need to be reviewed and make sure that they're

described properly and accordingly and things.

And I guess I wonder, do you think about that in the work that you're doing?

One, it sounds like some of what you just said makes it sound like the quantitative

aspect of that is becoming less and less important as things improve dramatically.

But I wonder, do you think about humans in the loop with regard to what Wasabi offers,

or do you think about that as something that's up to the user post-Wasabi processing to manage

themselves?

Aaron Edell:

No, I think about it all the time.

In fact, one of the bigger initiatives that we have, and we are still working on it very much, is a frictionless human in the loop process with your data.

So in spite of what I just said, I still think that you need to be able to teach it things

based on your data and correct it, and it should learn.

We do that with faces today, for example.

That's a really good example of this, but it's solved.

Where we want to take it is some of the other things you mentioned.

So improving proper noun and proper name detection, improving the way it detects certain objects

and things in your data, because maybe you're NASCAR or something, and you just have a very

specific content with objects that are, in the broader perspective, kind of strange,

but in your perspective are very set and usually the same or something like that.

You should be able to use your own data and say, "Yeah, that's what this is.

This is a tire.

This is this car."

And we actually do have it in the system.

We've just disabled it for now because I want to make it so seamless that you don't even

really know what you're ... You don't even really think about it as training machine

learning.

Just like ... I really love the Apple Photos example.

They just do such a good job with faces.

I don't know if you have an iPhone.

I'm sure Android does the same thing.

Just go in your photos and it's like, "Hey, who is this guy?

Who is that?"

Brilliant.

It should be very similar.

"What is this?

I don't know what this is.

Tell me what this is."

So I think about that a lot.

I definitely see ... There is just no better arbiter for accuracy in machine learning and

data sets than humans, ironically.

You have to, as a human, make some decisions.

For example, back in:

I bet I could train a classification engine to tell if a news article was fake news or

not fake news."

Fake news was a big problem in:

I went about to try and train it.

Basically that meant creating a data set of fake news and not fake news.

I wrote a lengthy blog post about the details, so I won't reiterate it here.

What I ended up figuring out was that, as a human, I have to decide what is fake news.

How do I ... Is it satire?

Is it factually incorrect information?

There's all these subcategories.

I just had to figure out where do I draw the line.

The machine learning ended up working best when the line I drew in the data set was bias.

What I was really doing was training a bias detection system.

So it was able to tell if this article was written in a biased way or an unbiased way

and rank it.

That journey for me was really telling about how data sets get made to train these machine

learning systems in the first place.

You really cannot mess up.

This is where the human in the loop problem or question can become a problem and you have

to think about.

If I am surfacing, "Hey, what is this logo?" and you get it wrong and the next guy gets

it right five times, you've caused a problem in your machine learning because you now have

a dirty data set.

So you need to think about that.

How do I keep it clean?

How do I check that this work that's been done is actually accurate?
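One common way to keep such a data set clean is to accept a human-supplied label only once enough people agree on it. The sketch below assumes that approach; consensus_label and its thresholds are hypothetical and are not a description of what Wasabi AiR does.

```python
from collections import Counter


def consensus_label(labels, min_votes=3, min_agreement=0.8):
    """Return the majority label if enough reviewers agree, else None.

    labels: the answers different people gave for the same item.
    min_votes: how many answers are needed before trusting any of them.
    min_agreement: fraction of answers that must match the top label.
    """
    if len(labels) < min_votes:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) >= min_agreement else None
```

With these defaults, one wrong answer followed by five right ones still yields the right label (5 of 6 agree), while an evenly split item stays out of the training set until someone resolves it.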

That's part of the reason why we're spending so much time thinking about it is we want

to get that experience right.

Chris Lacinak:

So that's on the horizon, it sounds like.

That's great. Look forward to seeing that.

And users of Wasabi AiR, you have, as we mentioned, a user interface within Wasabi's GUI, but

are there APIs that can push this out to other systems?

If people generate the metadata in Wasabi AiR, can they push it to their DAM?

Aaron Edell:

Absolutely.

In fact, we're in talks with several MAM systems right now. I think at IBC, which is in September, we'll be able to announce some of them, but we want

people to do that.

The vision for Wasabi AiR and for Curio prior to the acquisition was always that this is

a sort of data platform with APIs.

In fact, our whole UI consumes our own APIs.

That was really important for us and that was a wise decision that was made before I

came back to GrayMeta because at the end of the day, you know this, in the DAM world,

in the MAM world in particular, man, you can go in a lot of directions with a MAM.

You can get bogged down in the tiny features and all of the requests that customers want.

And I think that's why so many MAMs today are kind of like rubber band balls.

They have a lot of features and they're all different and they all have different buttons

and they can be very confusing.

It's really hard to keep something simple when you're sort of serving all of those use

cases and trying to build a thousand features, one for each customer.

I don't want to be in that business.

So I think we've got a great tool that gets you what you need off the ground right away.

Some customers have described it as a great C-level tool as well.

We just need some insight into this archive for our managers or for these certain groups

of people.

But the people who use MAMs and DAMs and really use them, they should have access to the metadata

too.

And so they will.

Chris Lacinak:

Yeah.

Well, let's talk, I think what I see when I look at Wasabi AiR is a blurring of the lines between what has been storage and DAM and MAM, but also between storage and other

storage providers that offer AI and ML tools.

Right?

So I'd like to, let's touch on each of those for a minute.

Wasabi AiR brings to the table something that is in many ways, not new, right?

Google Cloud and AWS, they have a suite of tools that you can use to process your materials,

but it is new that you turn on the switch and it does it automatically.

You don't have to go deploy this tool and that tool and put together workflows and things

like that.

Is that the main difference between, is that how you would describe the difference between

what Wasabi is doing today and what AWS is doing today?

Aaron Edell:

Absolutely.

Yes. I mean, I feel like I don't even need to continue talking, but I will, because I think you described

it pretty perfectly.

We want it to be very simple and elegant and we kind of want to redefine what object storage

is.

What is, especially cloud object storage, like what criteria defines cloud object storage?

And having metadata and an index that's searchable, I think is, we're hoping is going

to be the new definition because it is really hard to solve this other ways.

I mean, there are other similar tools, but yeah, if you use the hyperscalers, first of

all, it's an API call.

You still have to process your video, transcode it, and in some cases, chop it up, post each

of those pieces, or in other cases, you can send the whole file, but I think it depends,

to an API endpoint, get back that metadata and then what, right?

Like it's a JSON file.

And then, so if you want to view this metadata on a timeline and make it searchable, there's

a whole stack you need to build with OpenSearch or some kind of search index incorporated.

You need to build a UI.

You have to process and collate all that metadata.

You have to keep track of where it came from, especially if you're chopping stuff up into

segments.

And yeah, you end up building a MAM.
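To give a feel for just one piece of that stack, here is a minimal sketch of collating per-segment results back onto a single timeline. The input shape (an "offset" in seconds plus a list of "labels" with segment-relative "start" and "end") is an assumption made for the example, not the response format of any real service.

```python
def collate(segments):
    """Merge per-segment metadata into one timeline sorted by start time.

    segments: list of dicts such as
        {"offset": 60.0, "labels": [{"label": "car", "start": 2.0, "end": 5.0}]}
    where start and end are relative to the beginning of that segment.
    """
    timeline = []
    for segment in segments:
        for item in segment["labels"]:
            timeline.append({
                "label": item["label"],
                # Shift segment-relative times onto the full file's timeline.
                "start": segment["offset"] + item["start"],
                "end": segment["offset"] + item["end"],
                # Keep track of which segment the result came from.
                "source_offset": segment["offset"],
            })
    timeline.sort(key=lambda entry: entry["start"])
    return timeline
```

Transcoding, calling the service, indexing for search, and the UI would all still sit around this step.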

Chris Lacinak:

It's complicated.

Aaron Edell:

Yeah, it's complicated.

Exactly. I do think that the value of just being able to just turn it on, like here's my storage,

press a button, and now I've got this insight.

And if I want, I can hit the API, get the metadata into my existing MAM, but I also

have an interface, a search bar, a Google search bar into my archive just without having

to do anything.

I like that.

I like that solution.

Chris Lacinak:

Yeah.

It makes a lot of sense. And I suspect that there will be others that follow suit, I imagine.

Aaron Edell:

Probably.

Chris Lacinak:

So tell me about the blurring of the lines between the DAMs of the world and Wasabi,

because you're now, there is, this creates an overlap of sorts. How are you thinking about that?

What do you think it means to the evolving landscape of digital asset management?

Aaron Edell:

Yes.

It's definitely a heady topic. And I think that the MAM world has always been a world that both fascinates me and terrifies

me at the same time.

When we were at Front Porch Digital, for example, we integrated with all the MAMs that existed

at the time.

And I remember going to various customer sites and they would show me their MAMs and I was

just like, "Oh my God, this is so complicated.

I don't know, how do you use this?

There must be all kinds of training and everything."

And they were very expensive.

Very, very, very, very, very, very expensive to implement.

We had our own, we built our own MAM light.

We always called it a MAM light called DIVA Director.

And this is, DIVA Director is kind of where I think I get my idea of what a MAM should

be from, but it's not.

MAMs have a purpose.

There's a whole world of moving files around, keeping track of high res and low res and

edits and all that, that I am willfully ignoring at this point because that is important.

And it is complicated and there are wonderful MAM tools out there to solve all that.

But when I think about these customers that I spent so much time with, the Library and

Archives of Canada, the Library of the United States Congress, the Fortunoff Archive, the

USC Shoah Foundation, all of these archives have a kind of somewhat finite archive.

Now there's stuff that's new, that's born digital, and maybe they have parts of what

they do that, if you think about like, I don't know, NBC Universal are always making new

stuff, but they also have an archive.

And the people who are thinking about and maintain the archive have kind of different

use cases from other people.

So when I think about blurring the lines, I really think about the customer.

Like what do they need?

When they wake up and they go to work, what do they have to do with their fingers and

their hands and their brains on their computer?

And if it's, you know, manage an archive, be the person who can fulfill requests for

content, help other business units find things.

I think an application like Wasabi AiR is probably sufficient.

Now there's always new, there's always features and things that can be added and improvements,

but I don't want to take it beyond that.

Like I don't want to go further into the MAM and DAM world because I think that those existing

systems are way better than anything we could build for those purposes.

Chris Lacinak:

So it sounds, yeah, I mean, you look at a lot of DAMs, you know, there's complex permission

structures and a lot of implementation of governance and things like that, that Wasabi AiR doesn't do.

So in those cases, it sounds like Wasabi AiR could serve the purposes of some folks who

don't need a DAM or MAM otherwise.

And in other cases, Wasabi AiR is populating those DAMs or MAMs to help them, give them

the handholds, the metadata for improving search and discovery within their own systems.

Aaron Edell:

Exactly.

It's exactly, it's a source of more metadata and it's sort of a window into your objects that maybe your other MAMs don't have.

The other important thing too, is if you flip it, if you think about like S3, right?

If I have, and we've had customers who have had S3 buckets with hundreds of millions of

objects in them.

If you go into the AWS console, into the S3 console, there's no search bar, right?

That's not part of object storage, you know, because it's a separate concept.

I mean, it's, you know, and you have to solve it with technology.

You can't just search your object storage with no indices or anything like that;

otherwise it'd take a million years.
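The difference an index makes can be shown with a toy inverted index, which maps each metadata term to the object keys that carry it so that a search is a lookup rather than a scan of every object. The object keys and tags below are hypothetical, and a production system would use a real search engine rather than an in-memory dictionary.

```python
from collections import defaultdict


def build_index(objects):
    """Build a term -> set of object keys mapping.

    objects: dict of object key -> list of metadata tags for that object.
    """
    index = defaultdict(set)
    for key, tags in objects.items():
        for tag in tags:
            index[tag.lower()].add(key)
    return index


def search(index, *terms):
    """Return the keys whose metadata contains every search term."""
    if not terms:
        return set()
    matches = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*matches)


# Example usage with made-up keys and tags.
index = build_index({
    "archive/interview_01.mp4": ["interview", "microphone"],
    "archive/race_2019.mov": ["car", "tire", "logo"],
    "archive/promo.mov": ["logo", "microphone"],
})
print(sorted(search(index, "logo", "microphone")))
```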

So I feel like that's where we sit.

We are saying Wasabi AiR, Wasabi object storage now has a search bar.

That's it.

Chris Lacinak:

We focused heavily on audio and video today. Does Wasabi AiR also work with PDFs, Word documents, images, just the same?

Aaron Edell:

It does. Okay. It does.

And it's a good point because those open up, being able to process that opens up whole

other worlds, you know, that we don't spend a lot of time thinking about, but we will,

we're going to start.

Because, you know, and video and audio too is not just limited to media and entertainment

as well.

I like to think of, for example, law firms and, you know, maybe there's a case and there's

discovery and they get a huge dump of data.

And that data might include security camera footage of a pool gate or, you know, video

or interviews and depositions and not just all the PDF.

And I think, you know, if you were opposing counsel and you wanted to, you know, give

these, this law firm a really hard time, send them boxes of documents and, you know, you

can't search boxes and boxes of documents, right?

There's no insight into that.

Or say, oh yeah, I'll scan it for you.

You scan it and you send them PDFs, but they're not, they're just pictures, still not searchable.

So I think making PDF searchable, making Word docs searchable, pulling out, you know, images

that might be embedded in these things, processing those with object detection and logo recognition

and all sorts is a very valuable space that Wasabi AiR does today.

You just got to put it in the bucket.

Chris Lacinak:

Well, Aaron, it has been so fun talking to you today, geeking out.

Just it's really exciting. And your career path and your recent accomplishments have been just, you know, game changing, I

think.

Thank you for sharing your insights and being so generous with your time today.

I do have one final question for you that I ask all the guests on the DAM Right podcast,

which is what is the last song you added to your favorites playlist?

Aaron Edell:

Oh boy.

You know, I have to admit something that's going to be, that's going to divide your audience in an extraordinary way, which is that I actually own a Cybertruck.

I'm also a child of the eighties.

So the whole Cybertruck aesthetic really pleases me.

In fact, if you were to just crack open my brain and dive inside, it's like basically

would be the interior of the Cybertruck.

And the music that would be playing is the kind of a whole genre that I've only recently

discovered because of the truck is sort of eighties synth wave.

So I've recently added to my favorites, some very obscure eighties synth wave music that

I could look up.

Chris Lacinak:

Yeah, please.

Please do. We have a soundtrack where I add all of these songs to a playlist that we share.

Aaron Edell:

So recently I added a song called Haunted by a group called Power Glove.

Chris Lacinak:

Okay. Awesome.

Aaron Edell:

And the Power Glove has a space in it. It's Power Glove.

Chris Lacinak:

Good to know.

Aaron Edell:

Because there's also a band called Powerglove that doesn't have a space.

Chris Lacinak:

Good to know.

We learned yet another thing right at the tail end of the podcast. Awesome.

Well, Aaron, thank you so much.

I'm very grateful for your time and your insights today.

I really appreciate it.

Aaron Edell:

It's my pleasure.

Chris Lacinak:

Do you have feedback or requests for the DAM Right podcast?

Hit me up and let me know at damright@weareavp.com. Looking for amazing and free resources to help you on your DAM journey?

Let the best DAM consultants in the business help you out.

Visit weareavp.com/free-resources.

And finally, stay up to date with the latest and greatest from me and the DAM Right podcast

by following me on LinkedIn at linkedin.com/in/clacinak.

About the Podcast

DAM Right
Winning at Digital Asset Management

About your host

Chris Lacinak

As the Founder and CEO of digital asset management consulting firm, AVP (https://weareavp.com), Chris has spent nearly two decades partnering with and guiding organizations on how to maximize the value of their digital assets.

Hosting DAM Right is a natural outcome of a career that has encompassed playing roles from technical to executive, has included serving as an adjunct professor in a Masters program at NYU, and has consisted of building a company that has consulted with over 250 organizations in almost every sector. Chris brings this background and context with him to produce a podcast that dives into every aspect of digital asset management.