I joined the Rooftop Ruby folks to talk about the MySQL for Developers course I made for PlanetScale.
Collin
00:00:00 – 00:00:13
Hello.
Welcome to the show.
Today, we have a guest.
His name is Aaron Francis, who is a developer educator at PlanetScale.
And so, welcome to the rooftop, Aaron.
Aaron
00:00:14 – 00:00:20
Thank you for having me on the rooftop.
I think that was that was the intro.
It was welcome to the rooftop, and I love it.
Collin
00:00:20 – 00:00:25
Yeah.
That's what we're going for.
Joel, your Joel is also here.
Joel, how are you doing today?
How are you calling?
I'm good.
Yeah.
Excited to, excited to have
Collin
00:00:29 – 00:00:44
you here.
Aaron.
Yeah.
So Aaron just created a 7 hour I think it was 64 video long, free course called MySQL for developers.
Am I characterizing this correctly?
Aaron
00:00:44 – 00:00:53
Accurate accurate so far except for the word just because I feel like I've been working on it for a 1000 years.
But, yeah, accurate so far.
Collin
00:00:53 – 00:00:56
How long have you been working on this for?
It seems like a huge endeavor.
Aaron
00:00:56 – 00:01:17
Yeah.
It really was.
I probably got started maybe 18 months ago, kinda on and off.
I bought my first I've been working with MySQL forever, like most, you know, full stack or back end developers.
But I bought my first book on MySQL maybe 18 months ago, and I was like, man, there's so much here that I actually don't know.
Aaron
00:01:17 – 00:01:36
And so at that point, I started buying more books and taking notes and worked on it on and off for a super long time.
And then I joined, PlanetScale, and that's when I started working on it full time.
So for the past maybe 4 or 5 months, it's been, like, the the only thing I do.
It's my full time job, so a long time.
Collin
00:01:37 – 00:01:44
Wow.
That that's incredible.
How do you even think about structuring something, like, that comprehensive, like, with that much information?
Aaron
00:01:45 – 00:01:56
Yeah.
That was really difficult.
So so the course is called MySQL for developers, and my background is I'm a application developer.
I'm not a I'm not a DBA.
I've never been a DBA.
Aaron
00:01:56 – 00:02:19
I don't want to be a DBA.
I love being a developer.
So, we are kindred spirits.
I'm a Laravel developer.
I imagine y'all are probably rails developers since this is a Ruby show, but I I started reading all these books and realized like, man, maybe 30% of each of these books is useful to an application developer.
Aaron
00:02:19 – 00:02:47
Maybe maybe 25%.
And the rest of it is, like, really helpful if you wanna be a DBA, which is a great career path that I did not want to pursue.
And so as I as I was reading it, I was, you know, highlighting and making notes and doing all of that.
And once I had finished, you know, as many as I had, I kinda started to, like, go back through and look at my notes and try to coalesce it around topics.
And it started it just started with note cards.
Aaron
00:02:47 – 00:02:58
I just wrote down every topic and then kinda started shuffling them around to be like, oh, these three things are kinda similar.
These kinda go together.
And it just took a lot of a lot of time and, massaging.
Collin
00:03:00 – 00:03:13
Yeah.
That's amazing.
And so it's MySQL for developers.
When I was looking at it, the kind of outline seemed fairly general.
How much of it would you say is very specific to MySQL?
Collin
00:03:14 – 00:03:20
And is this you know, somebody's working with a different SQL type database, how much of this would be relevant to them?
Aaron
00:03:20 – 00:03:37
Yeah.
That is a great question.
And one that's super hard for me to answer because I don't know, like, I don't know Postgres very well.
And so in terms of the structure, it kinda starts with, like, building efficient and compact tables.
So that's the schema section.
Aaron
00:03:37 – 00:04:08
So in the schema section, we talk about all the data types that are in MySQL.
And so how much of that applies to Postgres?
Theoretically, a lot of it, but I know that Postgres has a lot more, like, specialized data types.
And so when I'm saying things like in, you know, talking about storing integers, and I'm talking about, well, you know, this this column is 4 bytes and this one is 1 byte.
And so if your data ranges between 0255, put it in this column.
Aaron
00:04:08 – 00:04:41
I just don't know if that's true in Postgres.
So that that's hard for me to say.
It's probably pretty close in MariaDB since that was a fork of my SQL, but it's pretty it's been a long time now.
I think I think one, there's a huge opportunity for some, ambitious content creator to make a post press version of this, which I think if they did it solo, they could make a lot of money off of it.
And 2, I think the, like, the concepts that we talk about are applicable to all databases.
Aaron
00:04:41 – 00:04:57
Like, we go over how do you make your tables compact?
Why is that important?
How do how do indexes actually work?
Like, why would we care about indexes?
So some of that higher level stuff is definitely, a little bit broader than MySQL.
Collin
00:04:59 – 00:05:33
Yeah.
That's what I was wondering because in the sort of top level outline, you have, you know, schema, indexes, queries, and then its examples.
But of that schema indexes queries, that seemed there's probably a lot of overlap there, so I was curious.
So starting at the top then, with schema, what would you hope someone to come away from your course knowing that they wouldn't know going in?
Like, a kind of application developer level mistakes that maybe you would help them steer away from, or am I am I on the right path here?
Aaron
00:05:33 – 00:06:10
Yeah.
Absolutely.
So I think, you know, in in the schema section, which is where we start, like, we don't even start with how do you query your data, which I think is a little bit maybe a little bit bizarre, but I I do make the case for it in the course that, like, you should start with building efficient tables because that's usually, like we usually start by writing migrations to build the, you know, to build the tables.
So why not let's start our learning there too.
I think the thing I would want application developers to come away with from the schema section in particular is you should think about your schema a lot more.
Aaron
00:06:10 – 00:06:36
Yes.
Data modeling is super important, and we don't really go into very much data modeling since that's an entire like, that's an entire course on its own is how do you how do you effectively do data modeling?
But what we talk about a lot is, alright.
Let's take a look at the data that you have or that you expect to have.
And what is the, like, what is the smallest column you can get away with that can still contain all of your data?
Aaron
00:06:36 – 00:07:08
What is the simplest version of a column?
Like, what what is the simplest representation of that data?
If it's a date, you should put it in a date column, not not a string column.
If it's if it's an integer number, you should put it in an integer column, not a string column, because we want the database to be able to work with it as efficiently as possible.
And then finally, the last point, which kinda takes like, it it takes a little bit of time to investigate your data is your schema should represent reality.
Aaron
00:07:09 – 00:07:39
The easiest way to say that is don't make a column nullable if the data is not nullable.
Like, that's the easiest, you know, your schema should adhere to the truth example.
But there are there are a lot more.
So, like, you should look at you should look at the reality of your data, and your schema should match that reality so that the database can work with the smallest, most efficient compact types, and you can trust your schema.
Because what you don't want is something like, okay.
Aaron
00:07:39 – 00:07:58
Well, you know, my data is nullable, but I think having nulls are is bad, and so I'm gonna invent my own version of null.
You know, I'm gonna put NA in all the column or all the all the rows that are actually null.
And it's like, well, you just you lied.
Like, the data's nullable.
Just make just make the column nullable.
Aaron
00:07:58 – 00:08:18
That's totally fine.
So I think that's that's the big takeaway is, like, spend some time with that, that data or what you expect the data to be, because that's gonna set you up.
Like, that is gonna set you up for either success or failure as you move forward into creating indexes and then finally, like, trying to get the data back out of your database.
Collin
00:08:19 – 00:08:35
Right.
So that's even a more basic than data than how you model your data.
Mhmm.
You're at the level of use the correct types which are provided by the database and, sort of don't work against the database, you know, work with it in the way it wants to be worked with.
Aaron
00:08:35 – 00:08:46
Yeah.
Totally.
And I think, an easy example is when, at least in Laravel, when writing a migration, it's easy to say, create an integer column.
Right?
That's very, very normal to do.
Aaron
00:08:47 – 00:09:22
And if you create an an unsigned integer column, the value can go up to, like, I think it's, like, 4,200,000,000 or something.
It's like the the legal range at the top.
And if you know if you know that, like, you're storing a 0 to 100, let's say, score or something, creating creating a column that can hold up to 4,200,000,000, it's not the end of the world.
It's really not.
But you imagine you take that same learning and you apply it to every column on every table across 1,000,000 or maybe billions of rows, and it starts to add up.
Aaron
00:09:22 – 00:09:49
And so that's the kind of thing where it's like, hang on.
Instead of just using whatever column is easiest to get to from your migration file, which is what, you know, I do all the time.
Let's think about what's the legal range and then find the smallest column that that holds that, and doesn't restrict you.
But there's no there's no reason to just use the default when you have, you know, 4 or 5 different integer columns that you can use.
I was just gonna say, as someone who has a background writing web applications in Ruby, I found that the schema part of the course, which is the only bit I've done so far, was just it was really interesting kind of exploring how the types that we use in applications, at least in Ruby where it's very abstracted.
Like, we just have this type integer in Ruby
Aaron
00:10:14 – 00:10:14
Mhmm.
And it can be any length of integer.
Right?
And the interpreter somehow magically works that out for you.
But when it comes to the database, you've got all of these other types.
And I think that the people working on, like, building databases are used to working with languages that have all of these other types.
But you are kind of giving, I think, a lot of people their first introduction to, like, thinking about how many bits an integer might have or, like, whether it's signed or not signed because many people, like, watching this course, I think just won't even have have even thought of that.
So that, I think, that alone is incredibly valuable for you to be able to create something that just a regular application developer can can watch and, like, come away with that much, like, information just about, like, these types that we work with.
I found that to be really interesting, but also just this idea of, like, using the right type for the thing that you're doing and trying to figure out how how small of a type you can fit the thing in with while still having enough space.
Mhmm.
Yeah.
I I learned a ton ton.
I I haven't finished any other parts of the course yet, but I learned a ton from the the schema section.
And it definitely made me think
Aaron
00:11:32 – 00:11:57
Oh, it's a long course, so that's fine.
Yeah.
That's that's good to hear because I I think one of my, like, great unfair advantages when creating this course is I'm an application developer myself.
And so those things that you just like, those things that you just mentioned about, like, oh, wait a second.
If I don't need negative numbers, I should make the column unsigned because that represents the reality more accurately.
Aaron
00:11:57 – 00:12:24
Like, all of those things that you're talking about and basically everything that I teach in this course was kind of something that I, like, had a light bulb moment while I was reading the docs or reading these books.
And I was like, wait.
Why is nobody talking about this?
Like, the frustration for me was a lot of the sequel content was either was either aimed at beginners of, like, this is how you would do a select.
The star means bring back all the columns.
Aaron
00:12:24 – 00:12:36
I'm like, yeah.
I know that part.
Or it was aimed at DBAs, and it was like, this is how you, you know, turn on bin log replication.
And I'm like, what are you talking about?
I don't I don't need to know that.
Aaron
00:12:36 – 00:12:46
Somebody else is gonna host my database.
I just wanna know how to drive it.
Yeah.
And so it's it's funny to hear you say that because that was the exact like, that is exactly what I was hoping would happen.
But but it's also just, like, people who are not writing in a language like Rust or something where you would have to declare, like, the type of an integer as being, like, a particular number of bits anyway.
Like, we just don't even think about it.
And I think, especially if you are, like, just building Rails apps, you just, like, add a column kind of using the generators, probably leave it as nullable even though like, that's the default is for it to be nullable, which I think is a bit weird.
It's really interesting.
Just, like, just the scheme a bit.
I can't wait to see see what else is there.
Aaron
00:13:23 – 00:13:26
Well, as as a PHP developer, we're in the same boat.
Collin
00:13:26 – 00:13:26
I I
Aaron
00:13:26 – 00:13:34
don't know.
Is it a string?
Well, PHP will kinda turn it into an integer for you without asking.
So, like, we're the wild, wild west as well.
Collin
00:13:34 – 00:13:59
It was really interesting to me to hear you 2 talk about that since I have generally come from languages that are more strongly typed.
And so I'm like, yeah, if it's gonna be 0 to a 100, you use, like, an 8 bit integer, you know, for that because you do have to declare that, where here you guys talk to them like, oh, yeah, when you're writing rails, like, you just that's that's not the default that you have to do that.
That's interesting.
Collin
00:14:00 – 00:14:14
So the next section of the course, I guess, we can just sort of go through an order, is indexes.
And this is a part that neither Joel or I have gotten to yet.
So maybe you can give us the overview of that.
Aaron
00:14:15 – 00:14:38
Yeah.
Totally.
And, again, we're, like, doing another section before we actually get to the section about pulling your data out.
And that's very, that's very purposeful because I think, you know, schema starts with, you gotta set yourself up for success and then indexes are very, very fundamental to how are you actually gonna query.
So like, before we even get to the querying section, we need to talk about, okay.
Aaron
00:14:38 – 00:15:01
Well, let's let's talk about how these indexes work.
And so, you know, there there are a lot of, specific MySQL things in there, but we start we start with some super broad, generic, like, computer science light.
I got my degree in accounting.
I did not get a computer science degree.
And so it's very much, it's very much computer science lite.
Aaron
00:15:01 – 00:15:24
But we talk about, like, what is an index?
Like, we know indexes are good or whatever, but what does an index actually do?
And so that's kind of the first several videos is actually like, how does what is an index?
Well, it's a separate data structure that keeps, you know, a copy of part of your data around.
How does it, like, how does it make it fast?
Aaron
00:15:24 – 00:15:54
And so I have a little diagram of a B tree that's like, instead of, you know, looking row by row, we can kinda start at the top and go to the bottom and we get there super fast.
And then what I think is really interesting, and this is a MySQL specific thing.
I'm sure other databases have something similar.
But what I think is really interesting is when we start talking about the primary key versus the secondary keys.
And so in in MySQL using inodb engine, but in in MySQL, the table is the the primary key.
Aaron
00:15:54 – 00:16:17
It's a clustered index, and so that's how your rows are actually, like, arranged.
And then every secondary index has, like, the the primary key appended to it.
So when you look something up by a secondary index, it goes to that index and looks through it, finds the primary key and phones home and goes and looks through that index.
And I just was like, this is super interesting.
I never knew any of that stuff.
Aaron
00:16:18 – 00:16:44
And so then after that, we get into more of like, okay.
Well, here's here's when you might use an index.
Here are good things to consider when you're looking at where should I put indexes.
Here are the rules about, you know, putting an index over several columns at once, and the rules are very specific.
And so we dive into a lot of that, a lot of that stuff that application developers need to know when they're trying to get, like, they're trying to get their data out.
Aaron
00:16:44 – 00:16:57
And I think most what we hit most of the time is the database becomes the bottleneck.
Like, people make fun of PHP all the time.
People make fun of Ruby sometimes because they're like, oh, Ruby's slow.
Come on.
The the bottleneck is gonna be the database.
Aaron
00:16:57 – 00:17:04
Like, with with any web application that we're building, the bottleneck is likely gonna be the database.
So that's what we try to hit in that indexing section.
Collin
00:17:05 – 00:17:17
I'm gonna ask you a question as to why we're a very junior developer and the kind of thing that maybe someone would wanna get out of your course, which is, like, can you over index?
Like, why would you not apply an index to everything?
Aaron
00:17:17 – 00:17:30
Yeah.
Yeah.
That is a that is a great question.
And one of the things it's actually one of the things I I talk about is, like, indexes are good.
You should have as many indexes as you need.
Aaron
00:17:30 – 00:18:08
And the second point is you should have as few indexes as you can get away with.
And so there's this tension between indexes are the primary way that you ensure a performant query, but they're also a copy of part of your data.
Right?
So that secondary data structure has to be maintained.
So anytime you do an insert, an update, or a delete, obviously, the primary key, which is the table, obviously, the table is modified, but then every secondary key that contains that data, which is a separate data structure, you have to go out and it has to be modified as well.
Aaron
00:18:09 – 00:18:31
And so storage, that doesn't move me.
Like storage argument doesn't move me so much.
You can you can bloat your table up with indexes.
The argument that moves me is those structures have to be maintained.
And so that can really kill your insert, update, and delete performance if you have literally 2 or 3 copies of your table in all these various indexes.
Aaron
00:18:31 – 00:18:38
So, yeah, it's, schema.
Schema is basically a science.
You look at your data and you're like, you know, how big is it?
Alright.
I'll put it in this column.
Aaron
00:18:39 – 00:18:53
Indexing is much more of an art and sometimes a dark art because you're like, I'm gonna have to make some trade offs here.
And so, yeah, you can definitely over index, and I think you could definitely under index, and that would be having no indexes at all.
I think one thing that I'm excited about getting to on the indexing section, which I'm kind of guessed is coming, based on the the schema section, is the ability to index generated columns, which I learned about for the first time.
I've not really used MySQL very much, or at least not to not recently.
And generated columns are cool.
So you can you showed in the first section of the course that you can create a generated column that is based on a JSON column, a particular query of a JSON column.
So you can say, I wanna look for the value at this key and then this key and then this key in this JSON column.
And you can either have it computed I I can't remember the technical term, but, like, on the fly as you read it, or you can have it computed in advance and put in an actual column on the table.
And I'm guessing that you're gonna get to you can index that column if it's computed in advance.
Aaron
00:20:01 – 00:20:21
You can, and you you have set me up well because generated columns are just I am just thrilled to death about generated columns.
I just find them so useful, and I I just don't think they're talked about enough, which sounds I don't know.
Maybe sounds super lame.
We don't talk about generated columns enough, but here here's the deal.
They're awesome.
Aaron
00:20:21 – 00:20:57
So, like, yes, we do talk about how to index them.
The great thing about a generated column is you just slap an index on it.
It my SQL treats it like a real column, so you just add an index to it.
So the what what a generated column is under the hood is basically you say it's very much like Excel where you would have a column that's a formula that references some other column, but they get really, really powerful when you do start to think about indexing because they allow you to reach, they allow you to reach things that would otherwise be unindexable.
So, like, let's take that JSON column, for example.
Aaron
00:20:58 – 00:21:29
In MySQL, you cannot put an index on a JSON column.
That that much I know is different in Postgres because you can throw I think it's a a gen index on a JSON column, and you get kinda, like, some magic, and it's awesome.
MySQL doesn't have that, but we love it anyway.
So what you would do in MySQL is you have a JSON column with a blob in it.
And then in a generated column you would say, I need to pull this one particular key or this one particular value up to the top level.
Aaron
00:21:30 – 00:22:09
And I want it I wanna make it a real column.
In an in our application world, we would do something like that maybe through, like, a a model observer or model callback or some sort of life cycle hook or life cycle event depending on what framework you use.
So, like, when the user, you know, when the user model is being saved, I also wanna touch this attribute as well, that sort of thing.
This is very much like that except maintained by the database at the database layer.
And so if you ever accidentally save a user without firing the callbacks or perhaps you're you're you're mucking around in table plus and you're modifying records in there, I mean, we've all done it.
Aaron
00:22:09 – 00:22:32
Right?
Those callbacks aren't gonna fire.
But using a generated column, the database is in charge of that and the database handles that, and it can't you can't mess it up.
And so it's super duper nice when you need to pull a key out to the top level, and then you can throw an index on that key in particular.
So if your JSON blob for whatever reason has, I don't know, you would in it or maybe an email key.
Aaron
00:22:32 – 00:23:02
And for whatever reason, it makes sense for that JSON blob to be that way, but you're like, man, I really need I really need that one key to be pulled to the top.
But if they ever get out of sync, I'm in big trouble because then I have an email key in the JSON blob and a different email key in this column.
That's what generated columns do.
So what you do is you basically create a new column and you tell MySQL here is the formula more or less.
Here's the formula I want you to use to keep this column updated.
Aaron
00:23:02 – 00:23:21
Then you get to tell MySQL either, and this is what you were talking about, either it's going to be a virtual column or it's going to be a stored column.
And virtual columns are calculated when they're pulled out.
And so it's just like it's just kinda like a view thing.
It's just like a macro.
It's like, okay.
Aaron
00:23:21 – 00:23:43
When you're pulling the row out, calculate this again.
Stored columns are actually, calculated when it's, inserted or updated, and they are written to the disk as a real column.
And they each have their trade offs, but it allows you then to just slap an index on it.
And the neat thing is you can actually index a virtual one.
Well, you don't even have to write it to the disc.
Aaron
00:23:43 – 00:24:06
You can index a virtual one because what MySQL is gonna do is it's gonna write that the result of that formula to that secondary structure.
And so when you then go to query and you're like, alright.
Let me find rows where this JSON blob has this key, this key, this key equal to Aaron.
My SQL looks at it and says, hang on.
I've seen, like, I've seen that formula before.
Aaron
00:24:06 – 00:24:21
In fact, I have an index that's based on that formula itself.
Even though it doesn't have a pretty name, I have an index based on that formula.
I'm just gonna go use the index.
And so MySQL is smart enough to look at that and go, I know what you're asking for.
I'm gonna use this index.
Aaron
00:24:21 – 00:24:46
And so I think generated columns are fantastic.
They're useful in more situations than just JSON.
Like, if you need to search by, let's say, a user's domain, you can have a generated column where you where you split it at the, the at sign, the email at the at sign and put the domain in its own column.
And that makes it a lot easier to search for domain because leading wildcard searches and text are hard to index.
And so you can just kind of break it up and index the whole domain.
Aaron
00:24:46 – 00:24:56
So I know that's a little bit of a fanboy about generated columns, but I think they're awesome, and I think they're super useful.
So, yeah, you should look forward to that indexing part in the in the second section there.
Collin
00:24:57 – 00:25:20
That's really interesting, and you were talking about trade offs before.
I have to imagine there are some pretty big trade offs to doing what you just said.
Right?
So, like, you probably get a a big bump in, you know, querying, but that must affect, you know, like you were saying, maintaining those multiple data structures.
Is there a big hit to, like, write performance when you use those sorts of things?
Aaron
00:25:20 – 00:25:49
You know, that is a good question, and it depends, and I don't really have any solid numbers on that.
I will say, for example, let's go back to the JSON the JSON example.
JSON columns are stored in such a way that when they reach a certain size, and usually you're shoving a lot of JSON in there.
Right?
When when they reach a certain size, the, value is stored often kind of a second location and a pointer is left in in the actual row.
Aaron
00:25:49 – 00:26:15
And so let's say then that you wanna pull you need to query on a key.
If you're querying on a key, not only does it have to inspect all of that JSON, it has to go out to that secondary location to look at that JSON even in the first place.
So in that case, using a generated column that is stored and you're like, wait.
Now I'm duplicating the data.
But if you use a generated column that is stored, you might not ever need to go grab that full heavy JSON blob.
Aaron
00:26:15 – 00:27:01
Now if you have a pretty basic let's say you have a pretty, basic column that is stored with all the other columns because it's not big, so it's not off in that secondary location, then it's basically the equivalent and you do let's say you do a virtual instead of a sword.
If you do a virtual instead of the sword, it's basically the equivalent of saying select star comma and then whatever your, you know, your virtual column is based on, whether that's, like, exploding the, we call it in PHP, we call exploding, splitting the username and the domain.
Like, it's the it's the equivalent of just adding to your select statement when it's a virtual column because it's just having to calculate that as it pulls it out.
So I don't really have any good stats.
That's kind of the way that I think about it.
Aaron
00:27:01 – 00:27:12
If it prevents me from going to access a blob off in that secondary location, I feel great about that.
And if it's just, like, adding a select for a pretty basic operation, I feel good about that too.
Speaking of going to a different location to access a blob, you mentioned in the first part of the course a technique where and this was new like, this idea was new to me.
I thought it's brilliant.
Where if you're gonna store, like, a large text field or a blob, you should consider storing it on a separate table.
And I think you mentioned that that's because if you want to, like, select if you if you're not interested in that column and you're selecting on the main table, then and you and you, like, select star, you're gonna end up making the database go and find all that text even though you might not need it.
And so it can be a lot better to have, like, this user join to go and find the the text as and when you need it.
But, like, the basic, like, default query doesn't include that.
Is that, like, what would you say are kind of the trade offs of using that technique?
I guess doing a join is its itself more expensive if you're always gonna need that text.
I don't know.
I I thought that was such an interesting technique.
Aaron
00:28:15 – 00:28:44
Yeah.
This is one that application developers, I think, specifically using frameworks like Rails, Laravel, or Django, something like that.
I think this is one that that we should probably think about a lot more.
So the basic the basic theory here is we want our rows on the disk to be compact as possible so that they are as readable as quickly as possible.
So, again, it's not about, like, saving some money to buy a smaller disk.
Aaron
00:28:44 – 00:29:06
It's about keeping all of that data super close together so that the database can grab it all kind of in in one go or as few goes as possible.
Right?
So the theory here is if you have a super wide table so you've got a table that's, I don't know, 50 columns or something.
And you might laugh, but it happens.
Right?
Aaron
00:29:06 – 00:29:26
These you're just like, hey.
Let's add, oh, let's add another thing.
Let's add another thing.
And, eventually, you've got these huge tables.
Now that may not always be bad, but when you do have a table that's kinda wide and it does have a lot of columns that you hardly ever use, especially if many of them are large.
Aaron
00:29:26 – 00:30:16
Like, if you, for whatever reason, need to store, text, which is the the official name for a huge text column, text blob, or a ton a ton of JSON, for example, then not only are you sending a bunch of data back over the wire that you never end up using, which is kind of a waste, you're also, like, going out to grab those potentially large blobs from that secondary location and sending that back as well.
But then also your rows are very spread out on the disk, and so the database is having to touch the disk more times, which is not always ideal.
So to prevent that, what you could do, you could split off, you could break your table.
Right?
So if you have a very long table that's very wide and it has a few hot columns and a few cold columns.
Aaron
00:30:16 – 00:30:45
Right?
So you've got some columns that, I don't know, you keep around for auditing purposes or to show, like, on a deep settings page.
But beyond that, you know, like, you never really need it.
You could break it up into, like, a users and a users meta or a users addendum or a user supplement, whatever you wanna call it.
And then that way your your super hot columns, that table's really narrow, really narrow, really compact on the disk, easy for the database to touch the disk as few times as possible.
Aaron
00:30:45 – 00:31:11
The trade offs.
The trade offs are real.
In your really frustrating if you are used to treating the users as a singular god object, and now there's, like, a user supplement contains most of that, like, long tail of information.
That could be frustrating.
Not only that, you you gotta kinda keep them in sync.
Aaron
00:31:11 – 00:31:47
Like, that's more of a read example.
But when you're writing, if you have, like, this, you know, god object page where they can change every setting or whatever, you gotta know which of the settings or which of the attributes go to to which table, and that can that can be annoying too.
And so it is not, I it has trade offs.
I'm glad y'all keep asking about it because there are no there are no free lunches, especially when it comes to, like, working with databases.
But if if the trade off is worth it, it's worth considering, and I think you would know if this is worth worth looking into.
Aaron
00:31:47 – 00:31:55
If you're listening to this right now and you're like, oh, yeah.
That one does have a 100 and 10 columns, doesn't it?
Yeah.
Yeah.
You might you might consider this.
Aaron
00:31:56 – 00:32:20
And this is especially this is especially true, I think, in an active record kind of world, which we both live in.
Because active record, at least in our world, in Laravel, it just issues a select star.
Like, every time, it's just issuing select star and populating the model, and we love it for that.
Like, I'm never gonna be the guy that's like, ORMs are bad.
I love ORMs.
Aaron
00:32:20 – 00:32:44
They make my life so easy.
But if you're issuing select stars and you're getting back 45 columns and you need 3 of them, okay, well, maybe we need to start thinking about, like, maybe we need to start thinking about either selecting fewer columns, which doesn't solve the disk problem, but is a best practice.
Or potentially, if it's extreme, splitting it up into kind of a secondary table.
Mhmm.
That makes so much sense.
And I guess one of the trade offs there is if you find that you're always going to that secondary table anyway, like, every time you access the users table, you're always going to the users meta table, then it's probably not worth it.
Aaron
00:32:59 – 00:33:01
You split it incorrectly.
Yeah.
Yeah.
You you whatever field it is that you're going for all the time needs to come over to the main one because
Aaron
00:33:05 – 00:33:42
And what a what a pain it would be to actually move it over to the main one.
And so you gotta know, like, you gotta be pretty sure before you do this that there's stuff that you simply don't need.
And maybe, like, maybe you start by not issuing select star in most cases.
Like, maybe you start on the application side, and you're like, I have a theory that this global scope or whatever would be best, by only selecting 5 columns.
I'm gonna start with that and see how often I change the scope to select more columns that are still in the same table, and maybe that's a good proving ground for, alright.
Aaron
00:33:42 – 00:34:01
Here's my theory.
Do I actually wanna, you know, go through the pains of splitting this table apart?
Let me let me do it on the application side first.
And I think one one point I do make about select star and not doing select star.
So if you ever hang out on Stack Overflow and you're like, I I did select star.
Aaron
00:34:01 – 00:34:17
You're gonna get shamed by all of these DBAs that are like, how dare you do select star?
You must be an application developer, aren't you?
And it's like, y'all.
Like, I understand I understand we inhabit different worlds, but let's have a little compassion here.
So not doing select star is the best practice.
Aaron
00:34:17 – 00:34:29
Right?
Only select what you need.
Absolutely.
I I cosign a 100%.
Sometimes, like, living in an active record world, you don't wanna deal with partially populated models.
Aaron
00:34:29 – 00:34:58
Like, you don't wanna deal with anemic models that you thought were gonna be fully hydrated.
And so if your ORM does not have protection against partially populated models, I would say do select star always because the risks aren't worth it.
It's like if you run a select first name last name to populate a user model and then you access email Mhmm.
And you think email is null, but really, email is just in the database.
You didn't you didn't populate your model with it because you thought you were, like, being a good DBA.
Aaron
00:34:59 – 00:35:17
That that risk is way too high to me.
So Laravel has recently introduced, partially hydrated model protection, basically.
I talked to some friends at PlanetScale.
I think Ruby or Rails has had this for a while, but select select only what you need.
Great idea.
Aaron
00:35:18 – 00:35:24
Accidentally dealing with a hot partially hydrated model sounds like it could lead to disaster.
So that's that's my caveat.
Right.
And I'm not sure how Rails handles that, actually.
Because I guess it could either try and just, like, go and fetch that thing for you as and when you require it, or it could raise an error and be like, you should have selected this thing.
Or like you said, it could just return nil.
Exactly.
Aaron
00:35:41 – 00:35:57
That's what Laravel did for a long time because we have so we have so many, like, magic methods that are like, oh, you accessed first name.
Did we pull it from the database?
Is it a method on the model actually?
Or is it this, you know, some other thing?
And if it's none of those, it just returns null.
Aaron
00:35:57 – 00:36:25
And You're like, oh, that's kinda nice until you realize, wait a second, email is definitely there.
I just was being clever and didn't select it and nobody told me.
And so now I'm treating the user as if they're a guest user because I don't have email populated, which, who cares about that really?
But now when you start talking about is admin or is blocked or billing details, and now you're running them through these flows that they should not be in.
So, yeah, be careful out
there.
But as as long as your ORM handles it, you mentioned, and I I think this is a great idea, if you're thinking of separating like, splitting a table where you've got, like, maybe some columns that you don't use all the time in a in a separate table, Doing that first by just taking your main table, like your main queries to the main table, and adding a select statement where you're selecting specific columns, that is a great way to, like, check.
Do I actually only need 5 of these columns, or do I really depend on the other 10 all the time?
So that that makes a lot of sense.
You also mentioned, and, I think this is just a great example.
You mentioned auditing.
And I can think of a ton of examples where there are columns that we put on database tables that are, like, write only.
We write them there because they're, like, for auditing, but we just never read them ever.
And sometimes those columns are things like, this is the entire JSON response that I got from sending out this request.
Aaron
00:37:31 – 00:37:31
Yeah.
And I'll just write it there for late.
So, like, so that's so it's there.
Aaron
00:37:34 – 00:37:35
Absolutely.
Yeah.
And that is exactly the kind of thing that, you know, if you're never gonna read it, it should just be on a different table.
And I think that is just such a great piece of, piece of advice to keep in mind.
Aaron
00:37:47 – 00:38:14
Yeah.
That's a great example because I think in, in some cases, people would be like, I cannot believe you're storing all of that in the database, and you don't even know if you're gonna need it.
And it's like, well, let me tell you a story about when I needed something and it wasn't there.
And so, yeah, we write it all to the database.
And, again, being an application developer myself, I feel like I have a lot more a lot more, empathy for what we actually need to do.
Aaron
00:38:14 – 00:38:31
Like, sometimes I feel like the database tail can wag the dog, and you're like, listen.
I'm trying to write an application.
I'm trying I'm trying to, you know, one, keep my job or build a business or whatever, but go home at the end of the day and be happy.
And, like, I'm sorry.
I'm storing a bunch of data in the database.
Aaron
00:38:31 – 00:38:38
That's what I'm gonna do.
And that's a great example of, like, yeah, dump it in the database, but how can we do that?
Maybe a little bit more efficiently.
Collin
00:38:39 – 00:39:45
There were a few things you mentioned that got me wanting to ask this question, which is, do you think there are a lot of places where when we are using ORMs of different flavors that they kind of guide you to doing things that are maybe not so ideal.
And someone, for example, just using rails or Laravel or whatever in the way it is meant to be used ends up with some sort of suboptimal results.
And I can give you an example of this actually from the iOS world, which is we have an ORM there called core data and a feature it had in the modeler, which you could sort of set up, what would you say, like, subclasses.
You could say, like, this thing I'm setting up in this visual modeler has a parent of this.
Like, the its parent entity is this, And that ended up being something people did not recommend you use, like, it was a feature you should stay away from because the implementation of it was when you were talking about very wide tables.
Collin
00:39:46 – 00:40:03
The implementation of it was that it just made that all one table.
So anything that was a child entity, this just ended up being a giant table.
And so that's an entire feature where it makes sense when you're using it, but actually ended up being kind of not a good practice.
Do you think there's a lot of that?
Aaron
00:40:03 – 00:40:23
I think there is definitely more of that than we talk about.
I think there are 2 there are 2, dominant voices online.
And by that, I mean, Twitter Hacker News and Stack Overflow.
Like, that's what I mean when I say online because that's where I hang out.
And I think the dominant voices are, ORMs are really bad.
Aaron
00:40:23 – 00:40:48
How dare you call yourself a developer and use an ORM?
You should be writing SQL by hand like our forefathers did.
And the other side is I don't know what database I'm using, but I have this great ORM, and I just, like, I just use the ORM.
And neither is correct in my opinion.
I think ORMs should be treated for what they are, and that's a tool.
Aaron
00:40:48 – 00:41:23
And it's our responsibility as developers to know how to wield this tool such that we save time and we're still we're still doing things correctly.
So I will say, and I have said I said in the course, I'm never gonna tell you that if you don't write SQL, you're not a real developer because I think, 1, that's silly, and 2, that's harmful.
What I will tell you is you should learn how to drive your ORM like an expert.
Like, you should be an expert user of your ORM.
And at the end of the day, unfortunately, you're still responsible for the SQL that it generates.
Aaron
00:41:24 – 00:41:32
I don't really care how you get there.
If you wanna write it by hand, that's great.
I don't wanna do that.
If you wanna use your ORM, that's awesome.
That's what I do.
Aaron
00:41:32 – 00:42:00
But you're still responsible for what it generates.
So what that can look like, I think one broad category of examples that includes a lot of, concrete examples is something called index obfuscation.
So when you put let's take, first name.
When you put an index on first name, the column first name, what MySQL does is it takes all the the values from first name and it arranges them in a certain way and it puts it over in that secondary structure.
Awesome.
Aaron
00:42:00 – 00:42:19
Very basic.
Right?
What could end up happening is you could be using your ORM to query against first name, and for some reason your ORM may throw a function on it that says lower and then in parenthesis first name.
Right?
So what the ORM has done, and Laravel's doesn't do this.
Aaron
00:42:19 – 00:42:43
I don't think Rails does this, but this is this is conceivable.
What the ORM has done is said, like, I'm gonna perform a case insensitive search for you by lowering the first name column and lowering your search input, and that way I'm being very helpful.
That way we'll make sure we get all the values that you need.
And what that means is MySQL is no longer able to use that index on first name.
Right?
Aaron
00:42:43 – 00:42:56
So you thought, okay.
People are gonna search by first name.
I'll put an index on it because I'm a very smart boy.
But now your ORM has totally hosed you because you don't have an index on lower first name.
You have an index on first name.
Aaron
00:42:56 – 00:43:09
And so my SQL looks at that and says, nope.
I'm gonna scan the entire table looking for these first names.
So that is an example of an index obfuscation.
Right?
And that is something that an ORM can do.
Aaron
00:43:09 – 00:43:37
A very real example, and this is something that Laravel does.
And here's here's where I will put my foot down and say, this is not Laravel's fault.
This one that I'm about to talk about is not Laravel's fault.
It is a convenient function for the developer, and you are responsible for investigating, is this writing basically the sequel that I want?
So where this can happen for real in Laravel is you can say, let's say, like, the user model, and then you can say where year.
Aaron
00:43:38 – 00:43:52
So there's a there's a method called where year.
And then inside of that, you can pass, like, created at.
So let's say I wanna find users that were created at created in 2023.
Right?
So I say user where year created at is equal to 2023.
Aaron
00:43:52 – 00:44:10
It's like, that was that was like a really nice developer experience.
I just got to write the words where year passed in my column and what year I'm looking for.
That's really nice.
What that does under the hood, unfortunately, is it wraps the created at column in a function called year.
I mean, it makes enough sense.
Aaron
00:44:10 – 00:44:27
Right?
Now you're created at column is obfuscated.
So if you had an index on created at, it's not gonna be used.
Like, you're totally hosed again.
What would be better here is if you as the developer look at that and say, well, that looks like index obfuscation to me.
Aaron
00:44:28 – 00:45:04
What I'm gonna do instead of saying where year, I'm gonna say where created at is between, and I'm then I'm gonna pass the 1st part of the year and the last part of the year.
Because now you're leaving your created at column untouched by a function and you're passing in 2 values, and MySQL can use that b tree to, you know, navigate and then scan down at the bottom to do a range scan.
And so that's one example where the ORM and, again, it's not the ORM's fault.
Like, the ORM messed you up there and you probably didn't know it, but it might not have messed you.
If you didn't have an index uncreated at, then you're in the you're in the same boat.
Aaron
00:45:04 – 00:45:25
And that's where I say, you need to be an expert driver of the ORM.
And that's what this whole course is about, not knowing, like, how do I run MySQL, but how do how can I be the best MySQL user ever?
And I'll let, you know, I'll let somebody else run it.
I'll I'll let, you know, the DevOps team or PlanetScale or whoever.
I'll let somebody else run it.
Aaron
00:45:25 – 00:45:44
But man, am I gonna be an expert driver?
So there are places ORMs can hose you.
I think on the whole, they're far, far better than trying to write raw SQL.
I think in the worst case, an ORM may hurt your performance.
I think in the worst case of writing raw SQL, you give away the keys to the kingdom, and it's just game over.
Aaron
00:45:44 – 00:45:47
So I I would take that trade off any day.
Collin
00:45:47 – 00:46:09
Yeah.
My takeaway hearing you talk about using ORMs and trade offs and those things is that I've used several different ORMs over the years, and they will, at different times, say, like, don't even think about this as a database.
Like, a database is one potential back end you could have.
Who even knows?
It could be anything.
Collin
00:46:10 – 00:46:31
You just think about modeling your data.
Like, that's that's what core data has always said, and I think what a lot of people found out with that experience was that you do actually kind of need to understand the database and what it's doing and have some, like, intuitive sense for that.
It seems like that might be sort of a point of your course is developing that sense.
Aaron
00:46:32 – 00:47:26
Yeah.
When you were when you were saying that, I was shaking my head so hard because I feel like that's a that's a disservice because it's gonna be such a I don't know.
I I don't know if you'd call this a leaky abstraction or I I don't know what you would call it, but to pretend that there's not a database underneath it, I think, is disingenuous at worse and naive at best because, yes, the ORMs are incredibly powerful and, you know, each of our communities is blessed with incredibly good ORMs, but there's still a database underneath.
Right?
And to pretend that there's not, it doesn't make a lot of sense to me because if I'm querying a set of files on disk, the the way that even that is implemented is just entirely different than querying an actual database.
Aaron
00:47:26 – 00:47:45
And so if if somebody's like, yeah.
Well, you can use this ORM and feed it a folder of JSON documents, and it works just the same.
It's like, that's a really neat trick.
I don't I don't think that that one is ever gonna happen to be like, well, I was gonna use my SQL, and I'm gonna use a folder of JSON documents.
Like, I was gonna use my SQL.
Aaron
00:47:45 – 00:47:50
I'm gonna use Postgres.
Sure.
I was gonna use MySQL.
I'm gonna use SQLite.
Sure.
Aaron
00:47:50 – 00:48:26
Now you get to go learn the differences about Postgres and drive your ORM in such a way that you know you're on a Postgres track or you're on a MySQL track.
But to pretend that there's nothing underneath, I think, is is silly.
And another example of knowing why or or why you need to know what's underneath is let's take polymorphic relationships.
Right?
I think you were talking about child tables earlier, very, very similar polymorphic relationships, at least, at least in, in Laravel, there's this table kind of that, that says, like, okay, this is linked to this model.
Aaron
00:48:26 – 00:48:44
Right?
So you might have a comment that can be linked to a user or an image or a video or whatever.
And somewhere in the database, you have to store what that other model is.
Right?
So what you need to know is, okay, this this comments table is probably gonna get really, really big.
Aaron
00:48:44 – 00:49:06
We're gonna have a ton of comments.
We're gonna have millions of comments, probably.
What's the most compact way that I can do it?
If you run a polymorphic relationship migration out of the box in in Laravel, it creates a a string column where the the class is stored.
So you can look in the database and be like, oh, this comment belongs to an app slash models slash image class.
Aaron
00:49:06 – 00:49:13
Right?
And you're like, oh, that's nice.
Like, that makes a lot of sense.
That's that's really helpful for me to see.
Hang on a second, though.
Aaron
00:49:13 – 00:49:50
If you're gonna have millions and millions of these things and you're gonna be putting a compound index over, like, the ID and the class type, well, I don't need to store the fully qualified class name.
I don't need to store, you know, 36 characters of text just to know that it's a it's an image or a user or whatever.
And so that's again where you can you can remember what we learned in the schema section.
You You want it to be as compact as possible, and you can say, I'm gonna take over this migration.
And instead of storing, you know, 30 to 50 characters of a fully qualified class name, I'm just gonna put a single number in there.
Aaron
00:49:50 – 00:50:09
I'm gonna do the mapping.
I'm gonna do the mapping over an application land where one equals one equals image.
And that's another example of, like, hey.
The the ORM is here to help you, but once you reach a certain scale, if you have a 1,000 records in there, it doesn't super matter.
I'm gonna be honest.
Aaron
00:50:09 – 00:50:35
But once you reach a certain scale, like, you're the developer.
You know your data.
You're in charge here.
And so I think that's another time when knowing what is under the hood, like, knowing what is backing your data, in this case, MySQL, you can look at that and say, wait, there are 4 distinct values across 10,000,000 rows.
Maybe I should compress that down a little bit, use an enum or use a tiny end or something like that.
Collin
00:50:36 – 00:51:03
Yeah.
Absolutely.
My experience has definitely been that whenever something says it can abstract the database or whatever, and you don't need to worry about what's under it that except for, like you said, with a 1,000 rows or something, except with the absolutely most trivial case.
You immediately need to know, and now you're debugging it.
But maybe you have this weird layer in between you, and it's actually harder than it ever would have been.
Collin
00:51:03 – 00:51:26
I think, fortunately, active record in particular, I don't know what Joel thinks about this.
I think it has a pretty good balance of that.
It's not abstracting it so far.
Like, there is an acknowledgment that these map 2 tables in a database and rows in the database, unlike some other things where you're not even supposed to talk about that.
Like, that's, like, a dirty secret that there's a that there's a database anywhere.
Yeah.
I I think it's it's not bad.
You can definitely I mean, like, the the polymorphic relationships, for example, are the same.
They create a string, and there's just no reason really to to use a string.
Also, I think there are some patterns in active record like, single table inheritance that can be pretty dangerous.
Like, not dangerous, but, like, you often end up with a table that is just really not ideal and you end up having to convert that to a polymorphic relationship instead of a single table inheritance type structure anyway.
And doing that can be a lot of work.
And I think there's probably it might be helpful if the documentation was a bit more.
Like, this is probably not a good idea anyway.
It's kind of not a very good pattern.
So one of the other things that you work on besides developer education at PlanetScale is a tool called Hammerstone, and it's kind of related to database querying and stuff because Hammerstone is kind of well, you describe it.
Aaron
00:52:37 – 00:52:38
Yeah.
That's part of our problem.
It basically makes it's like an API for, like, creating a really complex database based query.
Right?
So you can have, like, I don't even know how where to start.
Aaron
00:52:50 – 00:53:05
See, this is the problem.
You have perfectly you have perfectly illustrated the problem that we're having.
So I work at PlanetScale full time for, for a long, long time now.
I've been working on this this side project.
And so it's me and my partner, Colleen.
Aaron
00:53:06 – 00:53:27
And so what we have is a, it is a drop in component.
So we have one for Laravel and one for Rails.
So on y'all side, it's a gym.
On our side, it's just called a package, that you can install into your application, and then that gives you the ability to give your users a visual query builder.
Right?
Aaron
00:53:27 – 00:53:50
So this came out of I I before I was working at Planet scale a while back, I was working at a, like, a local services company.
It was a property tax company.
And I kept having employees come to me and be like, hey.
Can you run this report?
Like, can you show me all the houses in Dallas that we haven't protested yet and the value is under $200,000?
Aaron
00:53:51 – 00:53:59
And I'm like, yeah.
I can run that report.
And then they would come back and be like, hey.
Can you run it for Collin County where the value's under 300,000 and we haven't protested it yet?
I'm like, yeah.
Aaron
00:53:59 – 00:54:19
I can run that report.
And, of course, like, the 50th time that that happens, I'm like, y'all, how many reports are there?
Like, are we ever gonna reach the end?
And the answer is no.
And so what I did was I built out a a very flexible a very flexible query builder that keeps the developer in control.
Aaron
00:54:19 – 00:54:31
So, like, our data models, we talked about this a little bit at the beginning.
Our data models are, how can I say this nicely?
Quirky.
Our data models are lovable.
Our data models have a little bit of patching here and there.
Aaron
00:54:31 – 00:55:04
Right?
And we as the developer know that, like, our data model is not as clean as it could be because we've been in business for 10 years and, like, we've added things and removed things.
And so you as the developer get to say, like, here are all the conditions that are available.
Like, this one's a text, this one's a number, this one's a Boolean, and then you get to, like, you get to paper over your data model.
So you would say, like, this is a Boolean condition, and in the database, it is, you know, a Boolean, But there are also nulls, and don't expose that to the user.
Aaron
00:55:04 – 00:55:25
Just treat the nulls as true or treat the nulls as false.
So then your user can say, like, show me users that are active.
And, like, under the hood, it might be where it is active as 1 or active as null for some reason, but they don't need to know that.
And that's a big problem with these query builders that just sit directly on top of tables.
It's like, okay.
Aaron
00:55:26 – 00:55:37
I see, like, I see all these columns.
What what is is archived?
What is is deleted?
What is is active?
You have to really be intimately familiar with this data model.
Aaron
00:55:38 – 00:56:09
And so our thesis, our hypothesis with Hammerstone is almost every company needs to provide this kind of data to their internal users.
Right?
Almost everyone has a support team or admins or a sales team or marketing team that are coming to the developers and are like, hey.
Can you run that report again?
And our thesis has been, we will do all of we will we will eat all of that glass on your behalf and give you a final package that you can just drop in, and we have front end components that match it.
Aaron
00:56:09 – 00:56:22
And we will hand you back an active record query that you can either run or paginate or you can do whatever you want with it.
We don't run your queries.
You do whatever you want with it.
And that has been our thesis.
Mhmm.
So you basically have a language of describing your database schema, like, and converting that into a form, essentially, like the the the language for the the query builder.
And then you have these front end components that allow users to go to your web app and build up queries like this field begins with this value or is included in this date range or whatever the conditions are.
And that is something that you provide for Laravel and Rails.
Is that right?
Aaron
00:56:59 – 00:57:08
That's correct.
Yep.
So that is that is an accurate, summation.
We give the developer the ability to say, alright.
You wanna build a user's filter?
Aaron
00:57:08 – 00:57:23
Great.
Tell me all the conditions that could go in a user's filter, and then we, as Hammerstone, will present that on the front end and allow your end users to mix and match any way that they want.
You can add ores.
They can add groups.
They can do whatever they want.
Aaron
00:57:23 – 00:58:05
And then what our front end component emits is something that our back end component understands.
And so we take that that, JSON from the front end, and then we build a query on based on that JSON from the front end.
And importantly, we sit on top of active record implementations.
So we never write we never write SQL ourselves, but we sit on top of those active record implementations, and we will take all of their user input and the way that you configured these conditions.
So, like, if you said, hey, I have a date column, but for an unfortunate reason, all of my times are stored in CST, but I want you to query it as if it's UTC.
Aaron
00:58:05 – 00:58:18
Like, we got it.
We'll handle it.
We'll handle that conversion.
Your users don't even have to think about that.
So that's kind of like that's kind of our value proposition is we give your users control without you actually giving up control.
This is a massive time saver as well.
Anyone who's tried to build anything like this knows how complicated it can be.
And it's been it's been really interesting hearing you talk about it on the podcast as well.
You have a podcast to go with this, and it's, like, following the product development and all the interesting stories that go with that.
So I've been enjoying following that.
Aaron
00:58:41 – 00:58:43
No.
Thanks.
I'm I'm I'm glad to hear that.
Collin
00:58:43 – 00:59:02
You know what this reminds me of a lot is the when you're creating a, like, a smart list on macOS, any of those sort of UIs, there's a class for it, I think it's called.
Like in Mail?
Yeah.
Like in Mail or music or iTunes or whatever.
There's a class for it, I think, called NS predicate editor.
Collin
00:59:02 – 00:59:07
It looks it's it looks very similar to this.
I wondered if you had any inspiration from that kinda
Aaron
00:59:08 – 00:59:34
UI.
I have had a ton of inspiration from many of these type of builders.
I don't spend too much time in the iOS dev world, but, yeah, that this like, it it's been really hard for us trying to explain all of this, and we can talk about where we're gonna go with this in the future.
But one of the things that we've been trying to tell people is, like, yes.
You can use this to filter models on an index view.
Aaron
00:59:34 – 00:59:50
Like, that is the that is the least interesting and least sexy version of what we offer.
Like, you go to the user's page and you can say, show me users named Aaron that the state is equal to Texas.
Go.
And it's like, here are all those users.
Okay.
Aaron
00:59:50 – 01:00:22
Cool.
Everybody needs that.
I think beyond that, what's interesting is creating smart lists or email lists or, running recurring exports.
So, like, show me users that have signed up in the in the past 7 days, so you can, like, make it relative, in the past 7 days and their project count is 0 and their trial, you know, whatever is coming or their, end date is coming up.
And just send me an email of those people every morning because I know that those are my those are my churn, at risk of churning customers.
Aaron
01:00:22 – 01:00:58
And so you can take these visual queries and, like, store them, not the actual SQL, just the user's intention, store that places and use that to drive behavior elsewhere.
And I think that's where it gets really interesting, but we've just had a really, really hard time.
There's there's many reasons, but we've had a really hard time getting that value communicated.
And so Colleen and I, Colleen and I, who who's the partner, my cofounder on this, joined, TinySeed, which is a a bootstrapped kinda like VC.
They're not they're not bootstrapped.
Aaron
01:00:58 – 01:01:26
They back bootstrapped companies.
And so Colleen is now full time on it, and I spend my mornings, My I get up, come out here into the shed early and then work at PlanetScale and then work at night after the kids go to bed.
And so we're just, like, desperately iterating to try to find product market fit.
And it's, like, it's hard, of course, because it's, like, we have all this stuff that we think is valuable, and people are just saying, yeah.
That, like, that seems hard to integrate.
Aaron
01:01:26 – 01:01:43
Like, no.
We've we've made it super easy.
So hitting hitting the dev side, I think, has been a little bit of a challenge for us.
And so we're considering hitting the the business side.
Still speaking to developers, but making it less of a thing you integrate into your application and more of a thing that we host.
Aaron
01:01:43 – 01:02:06
And so that's kind of where we're trying to figure out, like, alright.
I think this problem is correct.
I think our thesis is correct.
I think our solution might be bad or or suboptimal.
And so, yeah, if you if you wanna hear an excruciating detail, hammerstone.dev, there's a podcast there, but Colleen and I talk every week about how to figure this kind of stuff out.
Collin
01:02:06 – 01:02:10
Maybe I'm crazy.
This made sense to me as soon as I saw it why you would want this, but,
Aaron
01:02:11 – 01:02:12
Well, that makes me feel good.
Collin
01:02:12 – 01:02:20
But I I definitely yeah.
But I definitely understand that, you know, explaining things to people who are then going to give you money can be very difficult.
Aaron
01:02:20 – 01:02:31
Paid, plugin or or packages or gems.
Libraries.
Yeah.
Plugin or or packages or gems.
Libraries.
Aaron
01:02:31 – 01:02:36
Yeah.
Paid libraries.
There's just not a big market for it.
Like, everybody's like, yeah.
Mike Purim has done it with Sidekick, and you're like, yes.
Aaron
01:02:36 – 01:03:11
Tell me who else has done it.
And so I think one of the things that Colleen and I have been talking about is, like, I don't think I have the energy to try to change the world in that regard.
Like, I would love it would be wonderful to me if more developers were able to spend a bunch of time creating a package and instead of open sourcing it and hoping that they get to do speak at a conference one day, they could just sell it and get paid for it.
And we had this thriving community of, like, micro entrepreneurs selling code.
I would love that for be for that to be the case.
Aaron
01:03:11 – 01:03:35
I do not have the energy or the money to, like, see that change, like, be the change there.
I would love to change the world in other ways.
Raising my children is a primary one.
I think I may just wanna sell this as a sass instead of try to change the world and make code a viable thing to sell.
And so that's kind of the hard balance is, like, there this isn't really a common thing in the market.
Aaron
01:03:35 – 01:04:00
And when you start selling to developers, they get afraid that, like, oh, well, I need to try it, and I'm gonna have to eject once I'm 95% of the way there, and it's not gonna solve my edge cases.
Or this seems like a fun problem, and I get I get paid to work, and so I'm just gonna spend, you know, 3 months implementing my own version.
And so we've had success talking to business owners who are developers that are like, yeah.
I'm a business owner.
My time is money.
Aaron
01:04:00 – 01:04:15
I'm just gonna do it.
But talking to in house developers, it's been more the case of like, I feel like this could be difficult and I'm just gonna burn a bunch of time on it anyway.
So lots lots of things that lots of people have learned before, but we're learning them for the first time.
It's it's incredible, isn't it?
Like, a developer has the at at at the average company, a developer has the autonomy to be able to spend, like, a few days working on something, but they don't have the autonomy to spend, like, a few 100 quid.
Like, a few $100, like, to buy something that would save them several weeks work.
And and, like, even even if it turns out to not save them several weeks work, did it save them 2 hours researching something else?
Like, did they learn something from it?
I think it's a real shame that that is the case.
And like you said, I I understand you don't wanna, like, pick that battle.
Aaron
01:04:53 – 01:04:54
Exactly.
Do do you find though that it is a bit different between the kind of Ruby and PHP communities?
Aaron
01:05:07 – 01:05:16
Community is a lot more gotta be careful.
The Ruby community is a lot more mature.
I mean, y'all have Shopify.
You have GitHub.
You have Heroku.
Aaron
01:05:16 – 01:05:40
And besides being great companies, they're huge companies.
Right?
And they they run cover for a lot of other smaller companies.
So, like, when people started companies, I don't know, 10 years ago, 5 years ago, they could look at Shopify, GitHub, and Heroku and be like, oh, we'll start with Rails because look, these are look at all these $1,000,000,000 companies.
And so that trickle down effect is very real.
Aaron
01:05:41 – 01:06:01
It's also very real in that there are a lot of highly paid Ruby developers that either work at these huge $1,000,000,000 companies, 1,000,000,000 plus dollar companies, or have matriculated out and work at other mature Ruby shops.
And so I think we have found the willingness to pay to be quite a bit higher on the Ruby side.
That's interesting.
Honestly, thought it would be the other way around.
Aaron
01:06:04 – 01:06:44
Well, maybe there's no willingness to pay on either side because it's certainly it's certainly not there on on the PHP side.
I think PHP has a long history of, being in reality or in derision, a script kitty community.
And I think that has been a hard thing to shake, but I think one thing that has has stuck around is there are a lot of hobbyists.
And the, like, the long history of PHP being heavily rooted in WordPress for a long time, We've now, I think, as a community moved on, and Laravel exists and runs cover for for a lot of these other businesses.
Because we can say, like, Laravel is very successful.
Aaron
01:06:44 – 01:07:05
Look at all these SaaS products that Laravel has built, but we don't have we don't have any $1,000,000,000 companies that are publicly, championing Laravel.
I am sure that there are $1,000,000,000 companies that use it, but they're not $1,000,000,000 companies that are out here, like, promoting the, you know, the Laravel Foundation.
Like, y'all just started the Rails Foundation.
You know?
But even $1,000,000,000 companies that people developers who work at $1,000,000,000 companies on projects at $1,000,000,000 companies, like, I mean, having worked at like, in a $1,000,000,000 company, probably I don't know how how valuable Shopify is or was when I was there, but it would be so much easier for me to justify spending a month building Hammerstone in house than it would be to, like, to get it approved by security and legal and, like, authorize the payments.
And you have to have, like, probably a bunch of, like, special contracts signed and agreed and everything, like, just to use a piece of software.
It's it's it would be impossible to to just, like, pull out a credit card and use this thing.
Collin
01:07:49 – 01:07:50
Yeah.
And that that's kind of amazing.
And I think yeah.
I don't know.
I you you've got, you've got a challenge there to, like, try and sell something like this.
Aaron
01:08:00 – 01:08:01
Yep.
We sure do.
I I really wish you the best because it would be amazing if people could, like, build really high quality software libraries and sell them if there's, like, a market to actually do that.
And I think it makes a lot of sense.
Like and we would have a better, like, a better quality of life if if we could if we had that kind of community.
Aaron
01:08:23 – 01:09:13
I think the more we can have, like, weirdly, this is a problem that I super care about.
Like, this report building, query building, filter building, I think it is 1, fascinating, 2, really valuable, and 3, like, a deep, deep problem.
And most people will probably, as they should stop at, like, 60 or 70% because they got to get back to, like, solving business needs.
And I think the more that we could have, nerds like me worried about one specific thing, the better the web would be.
But instead, we're all reinventing the wheels over and over and over because there's either, like, a culture of open source, which is extremely valuable, but people get burned out all the time because there's no money in it.
Aaron
01:09:13 – 01:09:30
Or people just reinvent stuff in house because there's no, like, paid supported licensed legal way to buy code like libraries.
And so I agree.
I think it would be better long term.
I don't know how to change that culture.
If if it was a culture of if these companies wanted to use your library, they would typically send you several $100 a year, then that would also be great.
Right?
You could make it open source.
Yeah.
But but that culture definitely doesn't exist.
And, yeah, that's it's sad.
Aaron
01:09:47 – 01:09:53
Yeah.
Which which battles do you have energy for?
And I think this is this is unfortunately not the one I have energy for.
Collin
01:09:53 – 01:10:07
Yeah.
Well, that's a lot of information.
So we've been over a lot.
I think there's probably a good place to maybe start to wrap it up.
So, Aaron, where can people find you?
Collin
01:10:07 – 01:10:12
What other projects didn't we cover that the people might be interested in?
Go.
Aaron
01:10:13 – 01:10:22
Yeah.
Thanks for, thanks for listening to me.
I mean, that's that's why you had me on, but I did drone on about databases for quite some time.
So thank you.
Thank you for listening to all of that.
Aaron
01:10:23 – 01:10:46
I think first and foremost, come find the course at planetscale.com/courses should take you there.
Planetscale.com/courses.
So that's where you can find MySQL for developers.
If you are looking for a place to put your database and you don't wanna host it yourself because you don't wanna be a DBA, that's what we do at Planet Scale.
So we host, we host MySQL databases.
Aaron
01:10:46 – 01:11:07
We provide a couple, layers on top of it to to make your life a lot easier.
But the core of the thing is we host databases and we're really good at it.
And I can say that because I'm on the education team.
All the big brains work over on the product and I just get to I just get to make videos.
So come to planetscale.com, and then you can find me on Twitter.
Aaron
01:11:08 – 01:11:23
My username is Aaron, a a r o n d Francis.
I'm on Twitter basically all the time.
Don't tell my boss, but I'm on Twitter basically all the time.
So come, come hang out on Twitter.
And if you wanna listen to me and Colleen's podcast, that's at hammerstone.dev.
Aaron
01:11:24 – 01:11:28
So that's a lot of links.
I think that should be all of them though.
Collin
01:11:28 – 01:11:43
Yeah.
Well, fabulous.
Thank you for being on the show.
I think there's a lot of really amazing information here, and I am definitely going to go finish your course and encourage other people to do the same.
And, yeah, thank you so much for for being here.
Aaron
01:11:43 – 01:11:49
Yeah.
Thanks for having me.
This was a lot of fun.
I I really appreciate you guys giving me the the time to drone on and on.
This was a joy.
Aaron
01:11:49 – 01:11:50
Thank
Collin
01:11:50 – 01:12:00
you.
Awesome.
Well, we will be back next week.
And in the meantime, thanks for listening to the show.
If you enjoy it, please remember to tell your friends, like, subscribe.
Collin
01:12:03 – 01:12:10
That's YouTube where you hit the bell, but do all of those things.
Hit the star in overcast, and, we'll see you next week.