Usage
To use data records, import the decorator from data_records:
>>> from data_records import datarecord
Decorating a class for use as a DataRecord
>>> from typing import Any, Optional, List
>>> from data_records import datarecord
>>> @datarecord
... class Person:
...     name: str
...     age: int
...     hobbies: List[str] = []
...     meta: Optional[Any] = None
...     nickname: Optional[str] = None
Much like with @dataclass, put the @datarecord decorator above your class declaration. In the class body, list the names
of the fields on your record along with their required type hints. DataRecords validate the data passed in and try to
coerce it to the specified type, raising an error at initialization instead of silently storing the incorrect type.
The above code would generate a class that behaves at initialization similarly to:
class Person:
    def __init__(self, name: str, age: int, hobbies: List[str] = [], meta: Optional[Any] = None, nickname: Optional[str] = None):
        self.name = coerce_type(name, str)
        self.age = coerce_type(age, int)
        self.hobbies = coerce_type(hobbies, List[str])
        self.meta = coerce_type(meta, Optional[Any])
        self.nickname = coerce_type(nickname, Optional[str])
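The initializer above relies on a coerce_type helper whose body is not shown. As a rough illustration only, a helper with that behavior could look like the sketch below. The body is an assumption built on the standard typing and ast modules, not the data_records implementation, and it handles only Any, Optional/Union, List and plain constructors such as int or str.

```python
import ast
from typing import Any, List, Optional, Union, get_args, get_origin


def coerce_type(value: Any, hint: Any) -> Any:
    """Hypothetical sketch: coerce ``value`` to ``hint`` or raise an error."""
    if hint is Any:
        return value
    origin = get_origin(hint)
    if origin is Union:
        # Optional[T] is Union[T, None]: pass None through when it is allowed.
        if value is None and type(None) in get_args(hint):
            return None
        for member in get_args(hint):
            if member is type(None):
                continue
            try:
                return coerce_type(value, member)
            except (TypeError, ValueError):
                continue
        raise ValueError(f"cannot coerce {value!r} to {hint}")
    if origin is list:
        # A string such as '[1, 2]' is parsed as a Python literal first.
        if isinstance(value, str):
            value = ast.literal_eval(value)
        args = get_args(hint)
        item_hint = args[0] if args else Any
        return [coerce_type(item, item_hint) for item in value]
    if isinstance(value, hint):
        return value
    # Fall back to the type's constructor, e.g. int('12345').
    return hint(value)
```

With this sketch, coerce_type('12345', int) returns the integer 12345, while coerce_type('b', int) raises the ValueError produced by int().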
Behavior
DataRecords have a few desirable behaviors that make them well suited to handling retrieved data of undetermined types.
Type Coercion
A datarecord guarantees that each field holds its annotated type.
Here zipcode can be parsed to int from str, so @datarecord does that for you:
>>> from typing import Optional, List
>>> from data_records import datarecord
>>> @datarecord
... class Address:
...     address_1: str
...     city: str
...     state: str
...     zipcode: int
...     address_2: Optional[str] = None
>>> data = {'address_1': '123 Any Street', 'city': 'AnyTown', 'state': 'ST', 'zipcode': '12345'}
>>> address_record = Address(**data)
>>> address_record.zipcode
12345
>>> type(address_record.zipcode)
<class 'int'>
The data can even be coerced from string representations, which is helpful when dealing with backups that lose data types:
>>> from typing import List
>>> from data_records import datarecord
>>> @datarecord
... class Weather:
...     city: str
...     day: str
...     temps: List[float]
>>> data = {'city': 'AnyTown', 'day': 'monday', 'temps': '[60, 63, 65, 70, 68, 62.5]'}
>>> weather_record = Weather(**data)
>>> weather_record.temps
[60.0, 63.0, 65.0, 70.0, 68.0, 62.5]
When the types cannot be properly coerced, a datarecord raises an error instead of silently storing the improper type:
>>> from data_records import datarecord
>>> @datarecord
... class Foo:
...     bar: str
...     baz: int
>>> Foo(bar=1, baz='b')
Traceback (most recent call last):
...
ValueError: invalid literal for int() with base 10: 'b'
Immutability
In other languages, records are immutable. Datarecords follow that pattern and don't allow fields to be modified after they have been set:
>>> from data_records import datarecord
>>> @datarecord
... class Foo:
...     bar: str
>>> foo_record = Foo('test')
>>> foo_record.bar
'test'
>>> foo_record.bar = 'other'
Traceback (most recent call last):
...
data_records.exceptions.CannotMutateRecordError: cannot assign to field 'bar'
>>> del foo_record.bar
Traceback (most recent call last):
...
data_records.exceptions.CannotMutateRecordError: cannot delete to field 'bar'
As a result, to update a data record you create a new one with replace, passing new values for the fields you want to change:
>>> from data_records import datarecord
>>> @datarecord
... class Foo:
...     bar: str
>>> foo_record = Foo('test')
>>> foo_record.bar
'test'
>>> new_foo_record = foo_record.replace(bar='something else')
>>> new_foo_record.bar
'something else'
>>> foo_record.bar
'test'
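For comparison, frozen dataclasses in the standard library follow the same replace-instead-of-mutate pattern. The sketch below uses only the dataclasses module. It shows the general idea and says nothing about how data_records implements it.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Foo:
    bar: str


foo_record = Foo('test')

# Assignment on a frozen dataclass raises instead of changing the field.
try:
    foo_record.bar = 'other'
    mutated = True
except FrozenInstanceError:
    mutated = False

# dataclasses.replace builds a new instance and leaves the original untouched.
new_foo_record = replace(foo_record, bar='something else')
```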
In line with the pattern matching offered by many languages that have records, you can use extract to get the values of the named fields out of a record, in the order requested:
>>> from data_records import datarecord
>>> @datarecord
... class Foo:
...     bar: str
...     baz: int
...     lat: float
...     long: float
>>> example = Foo('test', 2, 65.1, -127.5)
>>> latitude, longitude = example.extract('lat', 'long')
>>> latitude
65.1
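Conceptually, extract behaves like looking up each named attribute in turn. The standalone function below is a hypothetical stand-in written for illustration (the library exposes extract as a method, and its real implementation may differ). SimpleNamespace is used here only to avoid depending on data_records.

```python
from types import SimpleNamespace


def extract(record, *names):
    """Hypothetical stand-in: return the named attributes as a tuple, in order."""
    return tuple(getattr(record, name) for name in names)


example = SimpleNamespace(bar='test', baz=2, lat=65.1, long=-127.5)
latitude, longitude = extract(example, 'lat', 'long')
```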
In a pure data world, if two records have the same values for all of the fields, then they are equal. DataRecords expose this property:
>>> from typing import List
>>> from data_records import datarecord
>>> @datarecord
... class Foo:
...     bar: str
...     baz: List[str] = []
>>> r1 = Foo('test', '[1,2,3]')
>>> r2 = Foo('test', ['1', '2', '3'])
>>> r1 == r2
True
Builder Methods
Decorated DataRecords come with two builder methods to allow for easy mapping.
From Dict
DataRecord.from_dict(data) is a thin wrapper around DataRecord(**data). It is useful for mapping over a list of dicts:
>>> from data_records import datarecord
>>> @datarecord
... class Foo:
...     bar: str
...     baz: str
>>> data = [
...     {'bar': 'a', 'baz': 'b'},
...     {'bar': 'c', 'baz': 'd'},
... ]
>>> records = list(map(Foo.from_dict, data))
>>> records[0]
Foo(bar='a', baz='b')
From Iter
In a similar way to from_dict, from_iter is a helper method that is a thin wrapper around DataRecord(*data).
>>> from data_records import datarecord
>>> @datarecord
... class Foo:
...     bar: str
...     baz: str
>>> data = [
...     ['a', 'b'],
...     ['c', 'd'],
... ]
>>> records = list(map(Foo.from_iter, data))
>>> records[0]
Foo(bar='a', baz='b')
Notes
Field Ordering
Fields with defaults need to be declared after fields without. Declaring them in any other order raises a TypeError exception.
>>> from data_records import datarecord
>>> @datarecord
... class Foo:
...     bar: str
...     baz: int = 1
...     foo: str
Traceback (most recent call last):
...
TypeError: non-default argument 'foo' follows default argument
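The ordering rule mirrors a restriction in Python itself: a function signature cannot place a parameter without a default after one with a default. The sketch below compiles a hand-written signature equivalent to the rejected field order. It assumes the generated initializer takes the fields in declaration order, as the earlier Person example suggests.

```python
# A signature with the same ordering as the rejected class: bar, baz=1, foo.
source = "def init(self, bar: str, baz: int = 1, foo: str): pass"

try:
    compile(source, "<example>", "exec")
    accepted = True
except SyntaxError:
    # Python refuses a non-default parameter after a default one.
    accepted = False
```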
Missing Annotations
All fields are required to have annotations. If you are not sure of a field's type, you can use typing.Any
to allow any data type in that field; however, it offers no downstream safety on that field. A missing annotation
raises a TypeError.
>>> from data_records import datarecord
>>> @datarecord
... class Foo:
...     bar: str
...     baz = 1
Traceback (most recent call last):
...
TypeError: 'baz' is a field with no type annotation
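One way such a check could work is to compare the names assigned in the class body against the class annotations. The helper below, unannotated_fields, is a hypothetical name introduced for this sketch and is not part of data_records. It ignores dunder names and callables, and does not try to handle descriptors such as properties.

```python
from typing import List


def unannotated_fields(cls: type) -> List[str]:
    """Hypothetical check: class attributes that have a value but no type hint."""
    annotations = getattr(cls, '__annotations__', {})
    return [
        name
        for name, value in vars(cls).items()
        if not name.startswith('__')
        and not callable(value)
        and name not in annotations
    ]


class Foo:
    bar: str
    baz = 1  # assigned without an annotation


missing = unannotated_fields(Foo)
```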
Calling with parens
The preferred way to apply the datarecord decorator is without parens, but it works with them as well:
>>> from data_records import datarecord
>>> @datarecord()
... class Foo:
...     bar: str
>>> @datarecord
... class Bar:
...     foo: str
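A decorator can support both call styles by checking whether it received the class directly. The sketch below shows that common pattern with a hypothetical decorator named record_like. The marker attribute stands in for whatever processing a real decorator performs, and none of it is taken from the data_records source.

```python
def record_like(cls=None):
    """Hypothetical decorator usable as @record_like or @record_like()."""

    def wrap(target):
        # Placeholder for the real class processing.
        target.__record_like__ = True
        return target

    if cls is None:
        # Used with parens: return the actual decorator.
        return wrap
    # Used without parens: cls is the class being decorated.
    return wrap(cls)


@record_like()
class Foo:
    bar: str


@record_like
class Bar:
    foo: str
```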