Python is great, but stuff like this just drives me up the wall

@renzev · edit-2 9 months ago

Python is great, but stuff like this just drives me up the wall

@scrion · 9 months ago

I’m not talking about type checking, I’m talking about data validation using pydantic. I just consider mypy / pyright etc. another linting step, that’s not even remotely interesting.

In an environment where a lot of data is being exchanged by various sources, it really has become quite valuable. Give it a try if you haven’t.

@[email protected] · 9 months ago

I understand what you’re saying—I’m saying that data validation is precisely the purpose of parsers (or deserialization) in statically-typed languages. Type-checking is data validation, and parsing is the process of turning untyped, unvalidated data into typed, validated data. And, what’s more, is that you can often get this functionality for free without having to write any code other than your type (if the validation is simple enough, anyway). Pydantic exists to solve a problem of Python’s own making and to reproduce what’s standard in statically-typed languages.

In the case of config files, it’s even possible to do this at compile time, depending on the language. Or in other words, you can statically guarantee that a config file exists at a particular location and deserialize it/validate it into a native data structure all without ever running your actual program. At my day job, all of our app’s configuration lives in Dhall files which get imported and validated into our codebase as a compile-time step, meaning that misconfiguration is a compiler error.

@scrion · edit-2 9 months ago

I am aware of what you are saying, however, I do not agree with your conclusions. Just for the sake of providing context for our discussion, I wrote plenty of code in statically typed languages, starting in a professional capacity some 33 years ago when switching from pure TASM to AT&T C++ 2, so there is no need to convince me of the benefits :)

That being said, I think we’re talking about different use cases here. When I’m talking configuration, I’m talking runtime settings provided by a customer, or service tech in the field - that hardly maps to a compiler error as you mentioned. It’s also better (more flexible / higher abstraction) than simply checking a JSON schema, and I’m personally encountering multiple new, custom JSON documents every week where it has proven to be a real timesaver.

I also do not believe that all data validation can be boiled down to simple type checking - libraries like pydantic handle complex validation cases with interdependencies between attributes, initialization order, and fields that need to be checked by a finite automaton, regex or even custom code. Sure, you can graft that on after the fact, but what the library does is provide a standardized way of handling these cases with (IMHO) minimal clutter. I know you basically made that point, but the example you gave is oversimplified - at least in what I do, I rarely encounter data that can be properly validated by simple type checking. If business logic and domain knowledge has to be part of the validation, I can save a ton of boilerplate code by writing my validations using pydantic.

Type annotations are a completely orthogonal case and I’ll be the first to admit that Python’s type situation is not ideal.