Go Rusty Python
Why Do Programming Languages Have Types?
A walkthrough from untyped to typed
Which is better: Static or Dynamic typing?
Static typing can find errors in code before it runs. Yet how often do such type errors occur? Moreover, types are not tests, and you must still verify the correctness of your program even with types. Still, a well-typed program tends to be more reliable and easier to read.
Typing is not only a fun topic to argue about but also a tool that helps get the job done.
I want to give an overview of how the different kinds of typing are connected. I believe that understanding how types arise will help us use them more wisely, especially in languages where we have a choice.
No Types World
There are no types at the low level. The CPU and memory live their lives without knowing what a type is. All they have are zeros and ones.
The machine word is the only type at the low level.
Computer memory stores information in cells of various sizes, and the CPU executes instructions on machine words. Untyped zeros and ones turn into a 1-typed universe: bit strings of different sizes.
The low-level world is 1-typed, and everything has to be represented as machine words: characters, numbers, pointers, structured data, programs, etc.
When looking at a piece of raw memory, there is no way of telling what is being represented. The meaning of a piece of memory is determined by an external interpretation of its contents.
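To make the point about external interpretation concrete, here is a minimal sketch using Python's standard `struct` module. The byte values and variable names are chosen only for illustration; the same four bytes are read as an integer, as a floating-point number, and as text.

```python
import struct

# Four raw bytes; nothing in them says what they represent.
raw = bytes([0x42, 0x28, 0x00, 0x00])

# The same bytes under three different external interpretations.
as_uint = struct.unpack(">I", raw)[0]   # big-endian unsigned 32-bit integer
as_float = struct.unpack(">f", raw)[0]  # big-endian IEEE 754 single-precision float
as_text = raw[:2].decode("ascii")       # first two bytes read as ASCII characters

print(as_uint)   # 1109917696
print(as_float)  # 42.0
print(as_text)   # B(
```

None of the three readings is more correct than the others. The format string passed to `struct.unpack` plays the role of the type.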
What is a Type?
Types appear naturally, even when we start from an untyped world. As soon as we start working in a 1-typed world, we begin to organize it in different ways for different purposes. Types arise in any domain to categorize objects according to their usage and behavior. The classification of objects by their purpose results in a type system.
Type is an abstraction expressed in machine words.
When we talk about types, we often mean one of the following.
- Syntax - a type is a label associated with a variable.
- Domain of a value - the set of possible values a variable can hold.
- Representation - how a type is represented; it can be a primitive machine type or a composition of types.
- Behavior - the operations bound to a given type.
These meanings can be combined. For instance, if we define only Behavior, we have an abstract data type. When we have Representation + Behavior, we get a concrete implementation.
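The difference between Behavior alone and Representation + Behavior can be sketched in Python. The `Stack` and `ListStack` names are hypothetical and exist only for this illustration.

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """Behavior only: an abstract data type that says nothing about storage."""

    @abstractmethod
    def push(self, item: int) -> None: ...

    @abstractmethod
    def pop(self) -> int: ...

    @abstractmethod
    def is_empty(self) -> bool: ...


class ListStack(Stack):
    """Representation + Behavior: a concrete implementation backed by a list."""

    def __init__(self) -> None:
        self._items: list[int] = []  # the chosen representation

    def push(self, item: int) -> None:
        self._items.append(item)

    def pop(self) -> int:
        return self._items.pop()

    def is_empty(self) -> bool:
        return not self._items
```

Code that depends only on `Stack` keeps working if the list is later replaced by another representation.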
Why are Types useful?
The purpose of a type system is to spare us questions about representation and to describe proper behavior. Types impose constraints that help enforce correctness. Types also help developers reason about software structure.
Types are part of how we think about things.
If we think about the benefits of using a type system, we can highlight:
- Abstraction - types help us think at a higher level than bytes;
- Documentation - well-defined types also serve as documentation for the code;
- Safety - types help find invalid code with undefined behavior;
- Optimization - type information enables optimizations at compile time or at runtime.
Static and dynamic typing
Since we have types, we need a process for verifying their requirements and constraints. This can be done before running a program (statically) or at runtime (dynamically).
Static Typing
Static typing is the process of proving type correctness by analyzing the source code.
Manifest typing and type inference.
We can manually declare the type of every variable in a program; this is manifest typing. Or we can rely on type inference, whose rules deduce the types of expressions from little or no type information.
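As a rough illustration in Python, the function below declares its types explicitly, while the variable assigned from it carries no annotation and is left for a static checker such as mypy to infer. The `area` function is a made-up example, and Python itself does not enforce these annotations at runtime.

```python
from typing import get_type_hints


def area(width: float, height: float) -> float:
    """Manifest typing: every parameter and the result carry a written type."""
    return width * height


# No annotation here: a static checker can infer `float`
# from the declared return type of `area`.
total = area(2.0, 3.0)

# The declared types are ordinary data that tools can inspect.
declared = get_type_hints(area)
print(declared)
```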
Static typing is a great tool, but the requirement that all variables and expressions be bound to a type in the source code is sometimes too restrictive.
A number of programming language features cannot be fully checked statically. Among these are dynamic dispatch, late binding, downcasting, and reflection.
The constraint of knowing the type of every variable statically may be replaced by a weaker requirement: all expressions are guaranteed to be type-consistent, although the type itself may be statically unknown. This is generally done by introducing some runtime type checking.
Dynamic Typing
Dynamic typing is the process of checking the type safety of a program at runtime.
Many programming languages include some sort of dynamic type checking, even if they also have a static type checker. The reason is that some properties are hard or even impossible to verify statically.
Even traditionally statically typed languages such as C++ and Java perform type checks at runtime. These languages support downcasting a value to one of its subtypes, and in general such a cast is verified only when the program runs.
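The idea of a checked downcast can be sketched in Python, used here only as an illustration language. The `Animal`, `Dog`, and `as_dog` names are hypothetical; the point is that the check depends on the actual object, which is known only at runtime.

```python
class Animal:
    def name(self) -> str:
        return "animal"


class Dog(Animal):
    def name(self) -> str:
        return "dog"

    def fetch(self) -> str:
        return "fetching"


def as_dog(animal: Animal) -> Dog:
    """Checked downcast: succeeds only if the object really is a Dog at runtime."""
    if not isinstance(animal, Dog):
        raise TypeError(f"expected Dog, got {type(animal).__name__}")
    return animal


pet: Animal = Dog()
print(as_dog(pet).fetch())  # fetching
```

Calling `as_dog(Animal())` raises `TypeError`, much as a failed cast is reported at runtime in Java.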
At runtime, a program can hit various type errors. In some languages such errors are fatal; in others, it is possible to recover from them.
Strong and weak typing.
Another option is for the runtime to perform an implicit type conversion instead of reporting an error. A language may also allow pointer arithmetic that bypasses type restrictions. In both cases the language is commonly called weakly typed.
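For contrast, here is a small sketch of how Python, which is usually described as strongly typed, treats a string added to an integer. It raises a `TypeError` rather than converting implicitly, and the program can recover by converting explicitly. The function names are hypothetical.

```python
def add_strict(a, b):
    """Add two values; the operand types are checked only when `+` runs."""
    return a + b


def add_recovering(a, b):
    """Try the addition and recover from a type error with explicit conversion."""
    try:
        return add_strict(a, b)
    except TypeError:
        # No implicit coercion happened, so convert explicitly and retry.
        return int(a) + int(b)


print(add_recovering("1", 2))  # 3
```

A weakly typed language might instead silently produce a result for `"1" + 2` by coercing one of the operands.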
Polymorphism
Another very important feature of a language is the ability of code to operate on values of multiple types. This ability is called polymorphism, and it makes it possible to reuse the same code for different types.
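A minimal sketch of one such function in Python, using a type variable from the standard `typing` module. The `last` function is an invented example; it works for any sequence, whatever the element type.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def last(items: Sequence[T]) -> T:
    """Return the final element of any non-empty sequence."""
    return items[-1]


print(last([1, 2, 3]))    # 3
print(last("abc"))        # c
print(last((1.5, 2.5)))   # 2.5
```

The body is written once, and the annotation tells a static checker that the result has the same type as the elements.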
Polymorphic and monomorphic languages
The opposite of a polymorphic language is a monomorphic one. In such languages, every function and procedure has a single, unique type. Pascal is an example of such a language.
There are a few approaches to implementing polymorphism in a language, and all of them are tightly related to typing.
Gradual Typing
Neither static nor dynamic type checking is universally better. It would be good to be able to choose a typing discipline without changing the programming language. This idea brings us to gradual typing.
It allows parts of a program to be dynamically typed and other parts to be statically typed.
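In Python, this mix might look like the sketch below. The function names are hypothetical, and the static checks mentioned in the comments are performed by an external checker such as mypy, not by the interpreter.

```python
from typing import Any, Mapping


def parse_port(text: str) -> int:
    """Statically typed part: a checker can verify the argument and result types."""
    return int(text)


def load_setting(settings, key):
    """Dynamically typed part: no annotations, so any mistake shows up at runtime."""
    return settings[key]


def port_from(settings: Mapping[str, Any]) -> int:
    """Boundary between the two parts: `Any` lets unchecked values flow in."""
    return parse_port(str(load_setting(settings, "port")))


print(port_from({"port": "8080"}))  # 8080
```

Annotations can be added to `load_setting` later without touching its callers, which is how a codebase can migrate gradually.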
Historically, the term has been used for dynamic languages that add support for static analysis. Among these are Python, TypeScript, and Clojure.
C# can also be considered gradually typed: starting from version 4.0, variables can be declared as dynamic.
As we saw, combining different typing approaches existed before this new term appeared.
Conclusion
As we see, there is no right answer to which is better. But it is exciting to see how languages are constantly changing and evolving.
More advanced approaches are used to achieve flexibility, optimality, and expressiveness in the languages.
Those who don’t know the foundations are doomed to constantly argue about a better way. But it is always a trade-off.
Thank you for reading! Share your thoughts with me on LinkedIn and Twitter.
References
- Luca Cardelli, Peter Wegner, On Understanding Types, Data Abstraction, and Polymorphism, December 1985
- Luca Cardelli, Type Systems, 2004
- Robert Martin, The Clean Code Blog, Types and Tests
- Jeremy Siek, What is Gradual Typing