Eric Meyer and Brian Kardell chat about the history and evolution of polyfills and related efforts in the wake of recent events surrounding the transfer of ownership of the polyfill.io domain


Transcription

  • Eric Meyer: Hello, I'm Eric Meyer. I'm a Developer Advocate at Igalia.
  • Brian Kardell: And I am Brian Kardell and I'm also a Developer Advocate at Igalia.
  • Eric Meyer: And this week we're talking about polyfills. So I'm just going to do a very brief historical note. Polyfill, for those who aren't aware, is, I guess, a UK term for spackle or something that you use to fill in cracks or gaps, and then it hardens. And so that's why that term ended up being used to refer to little bits of JavaScript that we write to make up for the fact that one of the browsers we have to support doesn't yet support a CSS thing or a JavaScript thing, or even an HTML thing.
  • Brian Kardell: I think this is an interesting thing that happens a lot of times in... not necessarily in standards, but in libraries and language basically around computers, which is that a thing is developing, an idea is developing, there's some commonality developing to it, but it doesn't have a name, or it goes by a whole bunch of names, and then we give it a name and we rally around that name. And then the fact that we're all talking about the same thing also helps us move forward in other ways, because we can understand what we're talking about and think about it more concretely. So Remy Sharp introduced that word, but didn't invent polyfills. He just gave them a really good name and an analogy that makes a lot of sense in the UK and not much in any other place. But it's okay because the word works, in my opinion. I guess people have different senses of whether that's okay or not, whether it needs the metaphor. But yeah, I like it. I think this also has happened with, for example, I don't know if you know the famous Gang of Four Design Patterns book. Most of the patterns in there, it's not like those people invented those patterns necessarily. It's that they identified and named them and they said, 'This is a singleton.' And we'd been doing singletons already at that point for a couple of decades maybe. But we didn't have a name for it, and we didn't have a common... This is the pattern.
  • Eric Meyer: Or the invention of the term Ajax too.
  • Brian Kardell: Yeah, exactly.
  • Eric Meyer: XMLHttpRequest stuff. Or for that matter, responsive web design.
  • Brian Kardell: Exactly. Yeah.
  • Eric Meyer: The concept that was emerging. I would argue that Ethan did a little more work to tie the pieces together; it's not that he necessarily invented the whole concept of responsive web design out of thin air, but he did give it a very handy name and showed, in a very compelling way, how this should work.
  • Brian Kardell: Amazing demo that really sold you. Yeah.
  • Eric Meyer: Yep. So that's how we get the name polyfill. So if you didn't know that, great. Now you do.
  • Brian Kardell: Very quickly, we got tools like Modernizr, and basically developers made it possible to move forward because IE was stuck and it had most of the market share. So there was this real tension between, hey, there's this potentially really interesting cool stuff, but practically speaking, there's no real excuse for you to actually use it. So it was a very chicken-and-egg problem, and polyfills made it feasible for you to consider exploring some of that stuff.
  • Eric Meyer: Yeah. Dean Edwards' IE7 was in many ways groundbreaking in this area, where he effectively wrote a bunch of CSS support into IE6 at a time when there was never going to be another Internet Explorer. Like you said, IE, Internet Explorer, was just stuck. It wasn't moving. And IE6 was supposedly the last ever version of Internet Explorer, and people wanted to use stuff that IE6 didn't support. So Dean called it IE7, as a... I don't know, a little bit of an elbow. And then a few years later, Microsoft said, 'Oh yeah, just kidding. We're going to do more Internet Explorers.'
  • Brian Kardell: Yeah.
  • Eric Meyer: But yeah, it was, I can't remember, it's been too long now. I can't remember what IE7 added support for.
  • Brian Kardell: Oh, it was almost nothing. Yeah, it was very, very small additions. It was just barely moving us forward. They were important mainly in that they restarted things. I mean, it wasn't a significant catch-up. I can't remember what the specific additions were either, but they were really small.
  • Eric Meyer: Some of it, yeah, but in a lot of ways it helped people move forward with CSS. I mean, that's really what it was mostly focused on. But yeah, so I mean, there's this long history of hacking our way around problems. Which is, that's computer engineering since the dawn of time, pretty much. But yeah, we have-
  • Brian Kardell: Well, there were a lot of things happening at that same time, which is interesting. Another thing that seems to happen in history is that a lot of early ideas and things crash together. And so there were a bunch of people who started building a lot of polyfills for the early stuff, because there was a lot to catch up on. I think I wrote a piece one time where I just literally went through and iterated all of the APIs and put them into a giant green and red table. And the trouble is that the gaps, because IE stayed stagnant, kept growing and growing and growing. The piece showed that there were like 1500, in maybe 2012 or 2013, something like that, 1500 APIs that were little squares that were red in Microsoft land. And it just kept growing and growing and growing. So there were people actively patching those, because in practice you don't need all 1500 of those. If you don't use those 1500, who cares? So practically speaking, you could find which of those you needed and find the polyfills, as long as they existed. And so there are a few people who wrote just gobs and gobs of polyfills. Jonathan Neal was one of those people. Do you know Jonathan?
  • Eric Meyer: Mm-hmm.
  • Brian Kardell: And Mathias Bynens. Yeah, a lot of those got used; you could use those with Modernizr, and a bunch of them were even embedded into the MDN documentation.
  • Eric Meyer: Yeah.
  • Brian Kardell: I don't know if you remember that, but that used to be a thing. Where with array methods, for example-
  • Eric Meyer: Okay, yeah.
  • Brian Kardell: ... there were implementations of the array methods in the MDN where it was like, here's how you check for support and here's how you still enable support if it's not available. Because that was such a common... like I said, it was 90-some percent of your users who didn't have support for it. So yeah, there was a lot of thinking going on around polyfills and also around modules. Because don't forget, this is before ES2015.
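That MDN pattern, feature-detect and then patch, is easy to sketch. Here's a minimal, simplified version using `Array.prototype.includes` as a stand-in for the array methods of that era (the real MDN polyfills were more careful about edge cases like negative `fromIndex`):

```javascript
// Feature-detect first: only patch if the browser lacks the method.
if (!Array.prototype.includes) {
  // Define a minimal fallback implementation on the prototype.
  Array.prototype.includes = function (searchElement, fromIndex) {
    var arr = Object(this);
    var len = arr.length >>> 0;
    var i = Math.max(fromIndex | 0, 0); // simplified: negative indexes clamp to 0
    for (; i < len; i++) {
      var item = arr[i];
      // Treat NaN as equal to NaN, matching the native "SameValueZero" semantics.
      if (item === searchElement ||
          (typeof item === 'number' && typeof searchElement === 'number' &&
           isNaN(item) && isNaN(searchElement))) {
        return true;
      }
    }
    return false;
  };
}

console.log([1, 2, 3].includes(2)); // true, native or polyfilled
```

Native support wins when it's present; the patch only fills the gap, which is exactly the spackle metaphor.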
  • Eric Meyer: Yeah, I mean, so polyfills are super, super useful, right?
  • Brian Kardell: Yeah, they're super useful.
  • Eric Meyer: Really useful. And sometimes that leads us down paths that, in retrospect, maybe weren't the best. We've had a situation recently where polyfill.io, which is a centralized library service, there's been trouble. Basically, from what I understand, the maintainer sold it, and now whoever they sold it to started serving weirdness up to people who were linking to it. It's as if Google Fonts, where most everybody just links straight to the Google Fonts servers and loads the fonts from there, started serving corrupted fonts or something like that, or fonts that had executable code embedded in them, which I recently saw is actually a thing you can do. So that has happened, and there are now articles from Cloudflare about how to automatically replace links to polyfill.io with something less troublesome. And just a whole lot of people saying, 'Wow, everything gets ruined.' But there's a part of me that's like, why did you link to a service in the first place? Why did you not have a local copy?
  • Brian Kardell: But this happens all the time. So this originates all the way back in 2014. In 2014, there was an article on hacks.mozilla.org, and it was co-authored by Robert Nyman, who we've done a lot of work with, and Andrew Betts. I don't know if you know Andrew, do you know him?
  • Eric Meyer: I do not think I do.
  • Brian Kardell: So Andrew was very involved in the community. He also was an event organizer. He organized all of the Edge Conf events and stuff like that; that's not about the Edge browser.
  • Eric Meyer: Not about the browser.
  • Brian Kardell: And yeah, I think maybe he also had a UK meetup, a UK web designers meetup. But then also he served on the W3C Technical Architecture Group. And-
  • Eric Meyer: Yeah, maybe I do slightly recognize him. I don't think we've ever met though.
  • Brian Kardell: Yeah. I mean, he's a very smart guy. But anyway, the article introduced this idea of a service, and what the service would do is look at the browser. Actually, it had a few ways that it could work, so I don't want to be unfair to it. But the easiest way, the one that won the day, was that it would look at your browser, try to identify it, and then say, 'Well, I know what things are missing,' and it would provide the polyfills for those. And I don't know, this was great and terrible at the same time, in my opinion. I think I was always honest about, I'm not sure that really relying on that is good in the long run, but that wasn't a thought on anybody's mind in 2014, really. I mean, it just got popular and people used it. And he also was the founder of FT Labs, the technical arm of the Financial Times. And so it was very scalable. It didn't involve a lot of setup. You didn't have to deal with maintenance if there were issues. Yeah, I mean, it's complicated. One of the comments that I made early on was, 'Well, you're using user agent sniffing. Is that really... That doesn't seem like the best thing.' They addressed in the article why they chose it: because maybe this particular version of IE has a broken implementation of a thing, so you can't just feature-detect it. So I don't know. It was very, very interesting. It featured almost all of the polyfills by the two gentlemen that I mentioned earlier. And there was, again, a lot of other things going on at the same time. About the same time, we had also established this Extensible Web Community Group and been talking about this idea that I was out evangelizing: hey, here's an interesting idea. If you can polyfill two browsers of three or four, why can't you 'prollyfill' four browsers of four? Why can't we experiment like that? And at about this point I was actually representing jQuery in the CSS Working Group.
Why is it that everybody's like, jQuery has won, why don't we just standardize jQuery? Why is it that we can't do that? And it's like, 'Well, because we didn't plan to do that, so now we can't do it.' Because which version of jQuery are you going to standardize? And if you choose the wrong one... well, there is no right one, because you will break some websites no matter which one you standardize. And there are ideas in jQuery that fit jQuery fine but actually didn't match the CSS philosophy of things. So jQuery, if you gave it an ID selector, would always give you the first match. It would just map that to getElementById. But there's nothing special about an ID in CSS, in the sense that it doesn't actually limit them. So if you have 10 things with that ID and you say they should all be blue, they'll all be blue. And then standards take really long because we can't agree, we can't have ideas that actually compete somehow if you do it in the standard, and we can't be wrong. There's so much fear that we just can't be wrong. And so it was, what if the web was extensible, so that you could evolve the web, you could find out what worked and what didn't work? We could run multiple experiments, and quickly, because you don't need to write C and stuff. So we also registered this website called prollyfill.io, which was a play on words, a term that we coined, because, what if we can't call it a polyfill? Because polyfill, in the metaphor, is filling these small holes, but this is standing up new walls. So this isn't a polyfill. We don't want people to get confused. This definitely isn't a standard. This is a proposal for a standard. And along the way, Robin Berjon, who we had on recently, was at W3C at the time, and he and I set up a Discourse instance that we called Specifiction. And basically this became the Web Platform Incubator Community Group.
The idea that you would want to incubate ideas and say, 'These ones, they're not on a standards track yet, but we would like to figure them out among a group of people that is really serious about it, and maybe make some polyfills and run some tests.' And then along the way, in addition to polyfills, we got origin trials, which is an interesting thing we could talk about maybe.
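Brian's jQuery point is concrete enough to sketch. Here's a hedged illustration: duplicate `id`s are intentionally invalid HTML, and the fast-path behavior shown is my recollection of jQuery 1.x's Sizzle engine, so treat the exact counts as an assumption rather than gospel:

```html
<!-- Duplicate ids are invalid HTML, but browsers and CSS tolerate them. -->
<p id="note">first</p>
<p id="note">second</p>

<style>
  /* CSS treats #note as an ordinary selector: BOTH paragraphs turn blue. */
  #note { color: blue; }
</style>

<script src="https://code.jquery.com/jquery-1.12.4.min.js"></script>
<script>
  // jQuery 1.x fast-pathed a lone id selector through
  // document.getElementById, which returns only the FIRST match:
  console.log(jQuery('#note').length);        // 1

  // Routing the same question through the full selector engine
  // (via an attribute selector) finds both:
  console.log(jQuery('p[id="note"]').length); // 2
</script>
```

So a "standardized jQuery" would have had to pick between two behaviors that both already had users depending on them, which is part of why a working group couldn't simply bless the library as-is.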
  • Eric Meyer: Yeah, I mean, in this vision to me, polyfills do for the web in general what preprocessors have done in CSS land. Which is, in a preprocessor, you can do things like variables and mixins and come up with functions that do color shifting or whatever. And then the ones that become popular, because they're useful, eventually make their way into CSS. We have variables. They're not called that, but we have variables, and mixins are coming pretty soon. That's being worked on now. The color shifting stuff is there and supported in at least some browsers. I think maybe all of the engines now, I'd have to check on that. But that's what preprocessors have done in the CSS space: allow the wider community to experiment with ideas, and for someone to come up with an idea and say, 'Hey, here's this thing. If anyone really would like to use it, here's how you use it.' And then if a bunch of people use it, then eventually the relevant working group, in this case the CSS Working Group, says, 'Yeah, that should be a thing that we actually have natively.' And so these polyfills, or prollyfills, in either case, can be that way for CSS, but also for JavaScript, for HTML, where you can write a polyfill that, I don't know, adds support for a completely new image format or a new vector image format. And it might be a huge polyfill, but if it turns out to be really useful, then people maybe start using it a lot. And then that serves as a signal to the standardization world that, 'Hey, maybe this should actually be standardized and made native.' And I really liked that. Actually, I wrote a piece, it's been a lot of years ago now, called JavaScript Will Save Us All, which was a little bit of maybe hubris, but it was making this argument. It's like, 'Hey, if browsers lag, we can write JavaScript to get them up to date, as we've been doing, and we can also speculatively try things and see if we can figure things out.
New things, so that even if the various working groups shrivel up and die, we still have the ability to extend the web.' I never did anything with it. The extensible web people, and I'm not in any way claiming that my article inspired that, actually did work towards that. So they actually get the credit. I get the credit for having an idea and publishing a blog post, which is about as low effort as you can get in this space.
  • Brian Kardell: Well, I think it fits in with that whole naming thing, because like I said, somebody articulated it, but I don't know that there weren't a ton of people having very similar thoughts. I'm really glad that you mentioned that, because right around the same time was really the birth of transpilers, another way that you can try to evolve the web. You can do experiments in transpilers. They can help shape, as you said, CSS, but also JavaScript. And later, actually while he was on the Technical Architecture Group... the TAG has findings; they're not standards, but they are agreed upon by the members of the TAG. And this was a piece that was called Polyfills and the Evolution of the Web. It was published 10 days before my birthday in 2017.
  • Eric Meyer: Yeah, okay.
  • Brian Kardell: It talks about how polyfills can be a way for us to help shape the web. And one of the things that they did in this, that I just want to mention, is that they explain that whole difference about polyfill. And while it is a useful metaphor to some people, it has problems. And it also recognizes that we have this prollyfill idea, and that there were very similar ideas that were slightly different, that were called ponyfills and not-a-fill and all these things. And they suggested that those are difficult to pronounce in some languages. 'Prolly' was a very of-the-time kind of word. So it's not the best word. So polyfill, like it or not... we were stuck with that. And so they suggested that 'speculative polyfill' is a better name. I agree, actually. It's less fun than prollyfill, but unfortunately it came after there was a book published by O'Reilly that I actually wrote a blurb for. I didn't write the book. Building Polyfills, that's what it was called. It was by Brandon Satrom. It was published in 2014. It talks about prollyfills. So, whoops.
  • Eric Meyer: Yeah, well.
  • Brian Kardell: But this idea of also having one URL that you could use... In theory, you're caching these on a CDN. So they're on the CDN, there's a finite number of variants, and on the CDN they can just hash those, quickly identify your browser, and just look up the right hash and send that back to you. So this is a cache at the edge, where you can have n possible answers but get the right one very quickly from something that is hopefully geographically close. But there are also issues with that. I mean, there were other reasons at the time that we wanted to point to a CDN. And it's funny how the lay of the land changes. Because most of those things, most of the other reasons that we used to want to do that, they're not true anymore. Because we used to think, well, if you could point everybody at, say, the jQuery CDN to get jQuery, then it's not as good as native, but you can cache just one copy for everybody. But well, caches don't work like that anymore. And there was so much work around that as an idea, but they don't work like that anymore. Websites each get their own fully independent cache, so there's no sharing. But maybe we can talk about that whole hash idea, because that is relevant to a thing that developed at around the same time-
  • Eric Meyer: Go for it.
  • Brian Kardell: ... called Subresource Integrity. And Subresource Integrity had among its original use cases, these similar ideas that was like, we could maybe just cache once for everybody, so you could only have one copy of jQuery. And they were trying to combat this idea that there's a danger in that. Because what if everybody trusts this server and then something happens where somebody compromised this server? Or it doesn't even have to be... Well, I mean, I guess that's still compromising it, right. I mean there's many, many ways that it could be compromised, right?
  • Eric Meyer: Yeah. One of them is it could transfer ownership and the new owners do something-
  • Brian Kardell: Nefarious.
  • Eric Meyer: ... new. Different with it. It's nefarious or otherwise, but yeah, it could be nefarious.
  • Brian Kardell: That's right. It doesn't really have to be nefarious. It just means changing the contract and you're not getting back what you think you agreed to. Yeah.
  • Eric Meyer: So Subresource Integrity, tell me if I am getting this wrong, because I might be. But from what I gather, it's a way of essentially signing things that you're going to be loading. So to go back to that jQuery CDN example, where we said, okay, let's set up a CDN that's just for jQuery, and everybody can point to that. And a more recent version of this would be Google Fonts, where lots of people just point at the Google Fonts site, the CDN. But Subresource Integrity was a way to essentially sign that thing that everyone's pointing to. And then everyone that's pointing to it also has a copy of that signature. And then if it ever changes, then the signature changes and something happens, a warning or an error or something like that. Am I roughly correct?
  • Brian Kardell: Yeah, I mean, almost. So it's a hash. There's a cryptographic algorithm that you run across it that generates some hash that you would use to uniquely identify this thing. And so there are some well-known hash algorithms in the browser. And you can say, 'I want this to be my JavaScript,' or I believe you can do it with CSS as well, but only if it ultimately matches this hash. So you go get all the content, you run this cryptographic algorithm across it. And if the number... not the number, the hash, the mishmash of strings and numbers that you wind up with at the end is the same, then you use it. Otherwise, it throws an error in the console. I mean, it doesn't interrupt your program otherwise; the load just doesn't happen. It's as if it failed.
  • Eric Meyer: Right. Yeah. The browser acts as if it couldn't load the thing.
  • Brian Kardell: Exactly.
  • Eric Meyer: Because it refuses to load the thing, because its hash doesn't match the hash that the browser expects.
  • Brian Kardell: It does tell you why.
  • Eric Meyer: Sure. But from the user end point of view, in a way, it's as if the thing didn't load. So if Google Fonts were using this and the font changed in this, the hash changed, that custom font being loaded from Google wouldn't load. The page would render, but the font wouldn't load until the site was updated or whatever process there was of saying, 'I trust that thing.'
  • Brian Kardell: Yeah, right.
  • Eric Meyer: Okay.
  • Brian Kardell: So yeah, at the time, because we were talking about the caching thing, there was a really compelling use case to say, 'Well, man, I can make it really fast, because I could be like, okay, well, we want the jQuery 1.3.' So that seemed like a really great idea to me, and I remember being really excited about this, and I put this on the TAG's radar for a TAG review because I thought, 'Wow, that would be so great.' But then somebody pointed out the problem with that, that I hadn't realized. Which is, let's say you go to xyz porn site and you find some obscure thing that only exists on that website, their logo maybe, or whatever. And then I include that on my website. And if you visit my website, by looking at the time it takes to load, I can have a pretty good sense of whether you've been to xyz porn site.
  • Eric Meyer: Oh, okay. I'm reminded of, yes, the CSS visited problem, but this is a timing attack as opposed to a loading attack, but still, it's a fingerprint. It's an identifier. It's a way to say you have been to X place or you have, whatever.
  • Brian Kardell: Yeah. Anyway, lots of issues like that caused us to eventually be like, 'Nope, no shared caches at all.' Which is a little bit of a shame, to be honest with you. Around the same time we were trying other ways to do that. And I remember that Addy Osmani, also from the jQuery standards team, had made this thing. Not just him, him and some other people had made this thing called basket.js. And what it tried to do is use local storage for caching and loading, and it allowed you to build and manage the cache. It turned out that it was actually slower. Anyway, maybe one of those things where you're trying to squeeze every last drop out of it, and you're basing that on the current state of things. But then the state of things changes, and now it's not fast enough anymore. We also then made a version of that called tap.js. There's a few of us that worked on that. That was me and... you worked a little bit on it, and someone named Leo Strauss. A few other people too were involved in this, but it was a similar thing, except it was trying to do cross-domain through an iframe. And I believe, if I'm not wrong, it supported modules out of the box, but those were CommonJS modules. They're the kind that pre-dated the standard modules we have today. But yeah, I mean, there were all these ideas about modules and caching and CDNs and polyfills, and all this stuff was crashing together. And the state of the world has changed a little bit... not a little, a lot since then. And one of the things that changed was the owner of the domain polyfill.io. So it turns out that Andrew never actually owned that. I'm not sure who did own it, but somebody had bought the domain, and it led to this big issue. And I mean, that is an issue with domains. Domains are the one part of the web that is centralized. We have centralized domain names, and I think we know this. We know we've seen that happen before.
  • Eric Meyer: Yeah. I mean, it's distributed, but yeah, it's a point of weakness, point of failure, point of something, where if you accidentally let your domain expire, somebody else can grab it and do stuff, nefarious or otherwise. Yeah. And so in this case, I think we touched on it before, but just to recap for those who aren't familiar. There was polyfill.io. And then apparently it got moved onto a different CDN, however that happened. And so now you've got, like I said, Cloudflare and Akamai and all these global CDN networks publishing articles about how polyfill.io is now being used in a supply chain attack, as it's called. Apparently the new owner of polyfill.io has angrily denied that they're participating in a supply chain attack, but everyone from the outside certainly seems to think that something's going on, and it sure looks like a supply chain attack. So things are being served up that are not the original polyfills. Maybe stuff has been added on to the polyfill scripts, or they've just been outright replaced with different scripts. And yeah, The Register has an article that says something like, if you're using polyfill.io, as over 100,000 sites are, you need to change your links immediately. And I could imagine something similar happening if NPM were ever compromised in this way or a similar way, or-
  • Brian Kardell: Or, I mean, cdnjs. There are so many things that are like this that people just use. And another reason that people use them... Well, let's go back to the SRI thing for a minute. So the thing that clashes here is that we get a butting of two ideas that are incompatible in a way that creates a problem. Which is, if you have a website, and you even believe all those answers that were, well, you look at the UA string because there are different versions, and I don't want to maintain this, I want to rely on you to maintain it... that means that you accept whatever they send back to you, at any point in time.
  • Eric Meyer: Yeah, you have to.
  • Brian Kardell: And you used to be able to, on the jQuery CDN or any CDN that served jQuery, say, 'Give me the latest jQuery, I just want the latest one.'
  • Eric Meyer: Whatever that is. Yeah.
  • Brian Kardell: Well, that's great until it breaks. And so the question is, in production, what do you do? Because we have this tension today too, even with browsers and things like, you need to get security patches, you need to get updates. You don't want to be reliant on those kinds of things for you to be aware of them. Because it's difficult to even be aware that they were happening for your whole supply chain. So you would point at this CDN and say, 'I trust you to update.' That is totally in conflict with SRI because it means you can't know. You can't know what the thing is. And I think for a lot of people, just trust somebody else to do it, was really appealing. And so that's what they did. And so they couldn't use the SRI even if they wanted to. And then some other people have asked the question, 'Well, if you have a hash for that thing, it means that basically, the only thing that you'll accept is exactly that code. And it's a static file, so why not just put it on your website or your website's CDN rather than rely on some other thing?' So yeah, I don't know. That's all tricky and smashing together. And maybe the really interesting pivot here too is that, because we're talking about hashes and the world is changing and all these things, along the way, a thing that's happened since then is, people have talked about using hashes for a lot of other things. So IPFS uses hashes to identify content, and that and other distributed web things do similar. And they have this idea that's more like peer-to-peer networking where... I mean, not an expert in this area, but if you remember things like Napster and... Well, I don't know, I guess I'm aging myself by naming those kinds of technologies. But where you could have a swarm of people who all had the same file, and you can even be downloading parts of it from different people to optimize so that you're not taking too much of any one of those servers and so on. 
So that's a pretty interesting idea because you could just say, 'I want jQuery 1.6,' And it doesn't matter where it came from. It doesn't matter because I'm not saying I want jQuery 1.6. I'm saying I want this hash and whoever has this hash, just give it to me. I can get it from anywhere or simultaneously parts of it from six different places, if I wanted to. So I think that's a really interesting idea.
  • Eric Meyer: Yeah, it is. As long as you're confident that there's no hash collision or at least incredibly minimal hash collision risk.
  • Brian Kardell: Do you want to talk about that?
  • Eric Meyer: Yeah. Hash collision is where... basically, can you have two different things yield the same hash? So if jQuery 1.6 yields this hash, can you also have realbadthings.js yield exactly that same hash? Because if you can, then someone can put up realbadthings.js. And then if you have a system that's just looking for hashes, it'll be like, 'Just give me this hash, I just need this hash, whatever. Okay, I'll take that one.' And it turns out not to be jQuery 1.6; it does bad things. It's a crypto miner or whatever. So yeah, I mean, people who create hashing algorithms, generally speaking, if they're trying to do a good job, will work very hard to construct a system that is incredibly unlikely to have hash collisions. I think mathematically you can't absolutely rule them out. Odds of one in many illions, I don't even know how many illions it would be, septaquadrillions or something, chance of two unrelated things yielding the same hash. So yeah, there's this desire for trust and reliability, and there's a desire for flexibility. And those two things conflict. As far as I can tell, it's an unresolvable thing. You want the flexibility to just be able to say, 'Just give me the latest version of jQuery from wherever. I don't care where it comes from. I just want the latest version of jQuery from the web or the net.' And being able to do that is fundamentally in conflict with the desire to want to be absolutely certain that you're getting a version of jQuery 1.6, or the version of jQuery 1.6. Those two things just don't go together. Yeah, I mean, I'm sure it's not the only place we see that tension. We always want our jobs to be easier, but in at least some cases, the way to make the job easier also makes it more fragile in some ways. It can make the processes less robust and less trustworthy. So we saw that with polyfill.io. And I actually said earlier, what if maybe NPM could be supply chain attacked? And then that twigged in my head, 'Wait, weren't they?'
And I went and looked, and yeah, a couple years ago there was a supply chain attack on a few dozen NPM packages, where when people just ran their npm update, whatever, they got the latest version of whatever these packages were. There were, as I say, a couple dozen, few dozen of them, something like that, that had malicious code in them. Or certainly had code that was in no way related to what the package was originally intended to do and what it was installed for. And there were news articles trying to get the word out as much as possible. Check your packages, check to make sure, if you're using one of these packages, that you didn't get the malicious code, et cetera, et cetera, et cetera. Because NPM gives you a lot of flexibility and offloads a lot of the burden of maintaining this stuff to somebody else. It becomes somebody else's problem. I don't have to deal with this. I just grab the code and thank them very much for providing it, in my head. But my company probably doesn't actually donate any money to support the people who maintain those packages. And then, yeah, they get compromised. Maybe somebody who's been writing some packages is like, 'You know what? Nobody ever donates. Nobody ever supports my Patreon. I've been doing this for free, and now I've got somebody who's willing to pay me $25,000 for my five packages. I'm going to take it.'
  • Brian Kardell: Yeah, or just what I've seen happen in other cases. I mean, there have been a lot of cases like this over the years, but that's one where it gets bought out. Another is just, it was a domain and whoever owned the domain retired or died or just-
  • Eric Meyer: Lost interest or whatever.
  • Brian Kardell: ... forgot that it was on this credit card that was no longer valid, and it just expired and they lost the domain. Or, in terms of the NPM packages, what happens is Joe Schmo has this idea for a library that maybe is useful, and he makes this library that is maybe useful, and suddenly people notice it and it is useful. It gets a community around it, and this community then becomes big corporations. Google, Facebook, Apple, everybody is using, at some point through the supply chain, this person's labor for free, and nobody is sponsoring it, and it's not their day job. And he's got lots of pull requests to deal with, and it just gets exhausting. And after doing it for years, you get somebody in the community who is helping out a lot. And they offer to take it off your hands and continue the work, but you don't know them. You only know them from their commits, and then it turns out that they were just playing the long game to get access to the thing that's already in everybody's code. So yeah, I mean, it's tricky. And there's, again, this tension, because from a code standpoint, the hash thing is, I mean, that's what we've always done in a way. If you remember, you would download stuff, and it would always say, you get this MD5 checksum, and you run it and make sure that the thing that you got is this thing. I mean, that's the whole idea here. And for code, that makes a lot of sense technically, to make sure that it doesn't break. But also, like you said, practically speaking, who is going through and chasing all of those through the system and reviewing them and making sure that evil didn't creep in at the thing that does left pad on strings?
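The "download it, then check it against the published checksum" ritual Brian describes is a few lines in any language. A minimal sketch in Python, with a made-up payload standing in for a downloaded package (SHA-256 used here rather than MD5, since MD5 is no longer collision-resistant):

```python
import hashlib
import hmac

def verify_download(data: bytes, expected_hex: str, algo: str = "sha256") -> bool:
    """Return True only if the bytes match the published checksum."""
    actual = hashlib.new(algo, data).hexdigest()
    # compare_digest guards against timing attacks; for a publicly
    # published checksum, a plain == comparison would also do.
    return hmac.compare_digest(actual, expected_hex)

payload = b"function leftPad(s, n) { /* ... */ }"
published = hashlib.sha256(payload).hexdigest()  # what the author publishes

print(verify_download(payload, published))                # True
print(verify_download(payload + b"/*evil*/", published))  # False
```

Of course, as Brian points out, this only moves the problem: the check is sound, but somebody still has to publish a trustworthy checksum and somebody still has to actually run the comparison.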
  • Eric Meyer: Yep. Good old left pad.
  • Brian Kardell: I mean, there's so many things like that. I don't know. It's really tricky. I don't really have a total answer to it either. Part of it, I would say, is coming to grips with, not to sound like a broken record or a constant commercial for the same ideas, but I think coming to grips with the realities of open source. I think we're far enough into the open source era that we can see that the current shape of it is imperfect. We need open source to be actively funded. We need it to not be so one-way. We need to think about how we deal with issues of trust and safety. And I think we have some learning to do on this. But another interesting corollary here, though, is when you're just asking for content. So if you want an image, if you can ask for it via hash, in theory, that would eliminate the domain problem where somebody can steal the domain. Or not steal it, but just your lease is over and it's somebody else's now. Or you gave it away, so now somebody else controls it and it's no longer that thing. This is a really big deal for verifiability too. So if you want the web to last a long time, and you want to be able to make sure that this link that I am sending to you will be the thing that I'm looking at right now. I want to save this and make sure that what I'm pointing to doesn't change, for reasons of avoiding misinformation and stuff. So there are also, in schema.org, similar proposals from Dan Brickley that maybe this is good for that too. Maybe we should look at hashes as a way to say, even for content that I'm linking to: is this what I think it is? And this also gets tricky. I've built systems that rely on hashes, and when you start using them for a webpage, boy, that's super tricky, because it means... What are you hashing, exactly? So if I hash just the content of the HTML, you'll verify that I get the HTML, but somebody can have JavaScript or CSS that modifies the page.
So it doesn't mean that I've got the page. Or if I say I want to make sure that this page is exactly what I say it is, well, what happens if they change the copyright date at the bottom? It is very, very tricky for all of these purposes. But I do just think it's so interesting that we want to build things that are somehow maintainable and safe and performant, and we have efforts in all these areas, but over time, the facts on the ground change too, and it gets... I don't know. They're cross-pollinating each other, but they also make the story difficult for the other ones in a way.
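Both halves of Brian's "what are you hashing?" problem are easy to demonstrate. In this sketch (the HTML snippets are invented for illustration), a one-character edit to the copyright year produces a completely different digest, and nothing in the digest accounts for external scripts that rewrite the rendered page:

```python
import hashlib

page_2023 = b"<html><body>Hello<footer>&copy; 2023</footer></body></html>"
page_2024 = b"<html><body>Hello<footer>&copy; 2024</footer></body></html>"

# A one-character edit (the copyright year) yields an unrelated digest,
# so a strict hash check rejects a page a human would call "the same".
print(hashlib.sha256(page_2023).hexdigest() ==
      hashlib.sha256(page_2024).hexdigest())  # False

# And the digest covers only these bytes: an external <script src="...">
# referenced by the page can rewrite what the user actually sees
# without changing the HTML's hash at all.
```

So hashing a page pins its bytes, but neither its meaning (the year changed, but it's "the same" page) nor its rendered result (the bytes matched, but a script changed everything).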
  • Eric Meyer: Yep.
  • Brian Kardell: Yeah.
  • Eric Meyer: As we always say, it's difficult and it depends.
  • Brian Kardell: Yeah.
  • Eric Meyer: What's the solution here? Well, it depends. What do you want to optimize for, basically?
  • Brian Kardell: So yeah, if you're using polyfill.io, you shouldn't be using polyfill.io anymore.
  • Eric Meyer: Yep. Turns out. Hopefully there will be more clarity over time as to what actually happened. And who knows, maybe it will all get resolved and cleaned up. But even if it does, I think the question at the other end of that is, do you go back to using it? I mean, I would argue, no.
  • Brian Kardell: Yeah, I don't know. I don't know that the point of this chat is to suggest solutions as much as-
  • Eric Meyer: Yeah. Just explore.
  • Brian Kardell: ... exploring it and talk about the history and the things that were happening and how we got to that problem.
  • Eric Meyer: Yeah.
  • Brian Kardell: Cool.
  • Eric Meyer: Cool. Brian, thanks.
  • Brian Kardell: Thanks. Good chat.
  • Eric Meyer: Good chat, indeed.