CMOS #10: Michael Kennedy - Talk Python To Me
Hey, this programming stuff is way more fun than math! So I sort of, after a little while, stopped pursuing math and just went into programming, and I've been doing programming for about 20 years. And just every day, I think, I wake up and,
Well, if I can't listen to one, that's cool, I want these stories to exist, maybe I'll just have to make it myself. So I did and the community has been so supportive, and they really seem to appreciate the conversations and the people I have on the show. I mean, my show is all about the guests, I try to not make it about me. So it's just finding interesting people and projects, and talking about them.
Go do something amazing, and try to make this my full-time job. So in February, I made my podcast my full-time job, and I've been building online courses that go deeper into some of the topics we cover on the podcast, for people. And I've done that mostly through Kickstarter, but also through my online website.
So it's really cool to see both sides and another component that I'd like to throw in there, I think that fascinates me the most, because it really bodes well for open source, is talking to the people who have companies, that are built upon open source projects. And just the amazing stuff that they're doing.
So an example that really surprised me, was this open source library called Scrapy.
So Scrapy is a screen-scraping library for Python, and you can go in and say, suppose you want to go to some kind of website, they don't have an API but the data is clearly accessible with a CSS selector, as most data is.
You can go in and basically turn that webpage into an API with the Scrapy library.
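To make the idea concrete, here is a minimal sketch of pulling data out of a page by CSS class. It is not Scrapy's API: it uses only Python's standard-library html.parser as a stand-in, and the sample page, the ClassTextExtractor class, the extract helper, and the "price" and "name" class names are all made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this over HTTP.
PAGE = """
<ul>
  <li><span class="name">Widget</span> <span class="price">9.99</span></li>
  <li><span class="name">Gadget</span> <span class="price sale">4.50</span></li>
</ul>
"""


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self._depth = 0      # greater than 0 while inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            # Nested tag inside a match: track it so we know when the match ends.
            self._depth += 1
        elif self.css_class in classes:
            self._depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.results[-1] += data


def extract(html, css_class):
    """Return the stripped text of each element with the given class."""
    parser = ClassTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.results]


print(extract(PAGE, "price"))
```

This toy version does not handle unclosed void tags such as a bare `<br>` inside a match, and it only understands a single class name rather than full CSS selectors. A real scraping library takes care of those details, plus the crawling around it.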
Okay, that's cool, and I talked to probably the guy who created it, and he said,
Now what we're doing is, we create a business around this open source library.
Now if I had to ask you, how would you create a business around an open source library that does web-scraping?
It's great that it's popular, maybe consulting, maybe training, but it's not really too much of that.
What they're doing is they created web-scraping as a service.
So they have all this infrastructure and the re-tryability, and the bandwidth, and the parallelism, to massively scrape the web with their API that you already know, and they sell basically, like AWS sells infrastructure and so on, they sell the ability to do web-scraping.
And I think those combinations of open source projects are the most amazing, because you know that open source project is going to be really vibrant, and really well-maintained, because there's a whole business around it. They have 130 people working in their company, doing that.
their site got hacked because somebody left a development server up and running, and the server itself was not properly secured. It was not that there was any vulnerability in the code, it's just, whoops, the testing server was improperly secured. And so if somebody gets hold of your code, that's not good. Right, it's not good if they get hold of your data, but Passlib, what it does, is it employs all the best practices to automatically use the right one-way hashing for your passwords. It uses folding and salt, so it doesn't just hash your password once with some salt, it actually will do that 50,000 times, so it's computationally expensive to guess it, and then store that result. And it does all that stuff automatically. So it makes handling user accounts safely drop-dead easy, like one or two lines of code. So Passlib, I really like that.
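As a rough illustration of the salted, repeated hashing described here, the sketch below uses only Python's standard library (hashlib.pbkdf2_hmac). It is not Passlib's API, just the underlying technique. The function names hash_password and verify_password are made up for the example, and the 50,000 iteration count is simply the figure quoted above, not a recommendation.

```python
import hashlib
import hmac
import os

ITERATIONS = 50_000  # figure quoted in the conversation; an assumption, not advice


def hash_password(password, salt=None):
    """Return (salt, digest) using salted, iterated PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Store the salt and digest, never the password itself.
salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The point of the repetition is cost: each guess an attacker makes has to pay for all of those iterations, while a legitimate login pays for them only once.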
Another one that I think is really cool, that I actually did a whole show about, is this thing called Hypothesis. So, have you heard of property-based testing?
Here's a test, and if I have this user, and they have this ID, or maybe they're this age, and they try to create an account with this email, something will happen. Or, if you tried to purchase this thing for this price, something would happen. And you actually set up those numbers and details. With Hypothesis, you express things like,
I would like to test, with one of the existing users, buying this product, with some number between 0 and 100. And it will try all the permutations and variations, and it will seek out and find those little edge cases, where you're off by one or, if you don't say anything, it'll try to buy it for a negative price, and your system should catch that, but if it doesn't... There's all sorts of interesting things, so instead of writing these examples, and having one by one cases, you give it the relationships of the data, and it automatically tries a bunch of variations. So Hypothesis is amazing for unit testing, really, really nice.
Well, I tried to create an account and then buy something, and that worked and it should have failed because you didn't enter your billing information or whatever. It'll go through some pretty advanced stuff, but it's not magic. Most of the time, it's just, instead of writing one example test, you might write one property-based test and it's really like 100 example tests. Yeah, so Hypothesis, that's definitely one of the cool projects that's out there.
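A hand-rolled sketch of the idea, assuming a hypothetical purchase function that should accept prices from 0 to 100 and reject everything else. This does not use Hypothesis itself, which generates inputs far more cleverly and shrinks failures down to a minimal case. It only shows the shift from "one example" to "one property checked against many inputs".

```python
import random


def purchase(price):
    """Hypothetical function under test: accepts prices from 0 to 100 inclusive."""
    if not 0 <= price <= 100:
        raise ValueError("price out of range")
    return {"charged": price}


def check_purchase_property(trials=200, seed=0):
    """Check the property against boundary values plus random prices."""
    rng = random.Random(seed)
    # Always probe the off-by-one boundaries, then add random values,
    # including negative prices.
    candidates = [-1, 0, 1, 99, 100, 101]
    candidates += [rng.randint(-1000, 1000) for _ in range(trials)]
    for price in candidates:
        try:
            result = purchase(price)
        except ValueError:
            # Rejection is only correct for out-of-range prices.
            assert not 0 <= price <= 100, f"valid price {price} was rejected"
        else:
            # Acceptance is only correct for in-range prices.
            assert 0 <= price <= 100, f"invalid price {price} was accepted"
            assert result["charged"] == price
    return len(candidates)


print(check_purchase_property())
```

If purchase forgot the lower bound and accepted a negative price, the second assertion would fail and name the offending value, which is the kind of edge case described above.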
Another one that I'd like to point out, because it's probably the stand-out project in this area, but this area is very interesting right now, is this thing called Pyjion, from Microsoft of all places, for Python.
And what it is, is an extension to the main, primary Python runtime, the CPython implementation, that adds JIT capability, as a general concept.
So right now Python is an interpreted language, unless you use other implementations, or runtimes like PyPy, IronPython, Jython.
But those all come with drawbacks,
Oh, you can get this really great performance, like PyPy's five times faster, but you can't use a bunch of the libraries you know.
Similarly, for IronPython, and so on.
But this Pyjion thing is basically, instead of forking the implementation and rewriting it, it's trying to create a framework for people to plug in different JIT implementations into the existing one.
So as a community, everybody can come together and work on making the language faster, without forking it with these trade-offs.
Here's an email address, I want this indexed, so I can search quickly and I want it to be unique, so I don't get duplicate registrations, if I have to reset by email. Things like that. So SQLAlchemy, it's great.
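To show what "indexed and unique" buys you, here is a small sketch using Python's built-in sqlite3 module rather than SQLAlchemy itself. The table, the index name ix_users_email, and the register helper are all invented for the example. An ORM would let you declare the same constraint on the model instead of writing the SQL by hand.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
# One unique index gives both fast lookups by email and duplicate protection.
conn.execute("CREATE UNIQUE INDEX ix_users_email ON users (email)")


def register(email):
    """Insert a user; return False if the email is already registered."""
    try:
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected the duplicate row.
        return False


print(register("ada@example.com"))
print(register("ada@example.com"))
```

The database enforces uniqueness, so a duplicate registration is refused even if the application code forgets to check first.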