Carrying out my Erlang learning plan, I embarked on the porting exercise of my Address DAO package into Erlang Mnesia. In fact, it is more than porting of the data access functions. It is also a porting of the Address database itself from RDBMS into Mnesia.
Creating the Mnesia Schema
The Entity Relationship Diagram (ERD) of the database schema is shown below.
To create the tables in Mnesia, I defined the following records, each corresponding to a table.
-record(address_field_type, {id, name, locale_country, default_value, hierarchy_order, display_order, validation_rules, suffix} ). -record(address_field, {id, value, address_field_type_id, location_code}). -record(address, {id, status, location_code}). -record(address_addressField, { address_id, address_field_id }). -record(oids, { name, id }).
Note that there is an extra record oids
. This record/table is required because Mnesia does not support auto-incrementing fields usually used as primary key generation. Therefore this extra table is used to store the current key value - similar to Oracle's sequence.
Creating the tables are shown below. The oids
table is also populated with seed values.
create_db() -> mnesia:create_table(address_field_type, [ {ram_copies,[nonode@nohost]}, {attributes, record_info(fields, address_field_type)} ]), mnesia:create_table(address_field, [ {ram_copies,[nonode@nohost]}, {index, [location_code]}, {attributes, record_info(fields, address_field)} ]), mnesia:create_table(address, [ {ram_copies,[nonode@nohost]}, {index, [location_code]}, {attributes, record_info(fields, address)} ]), mnesia:create_table(address_addressField, [ {type, bag}, {ram_copies,[nonode@nohost]}, {index, [address_field_id]}, {attributes, record_info(fields, address_addressField)} ]), mnesia:create_table(oids, [ {attributes, record_info(fields, oids)} ]), Fun = fun() -> mnesia:write(#oids{ name=address, id=0 }), mnesia:write(#oids{ name=address_field_type, id=0 }), mnesia:write(#oids{ name=address_field, id=0}) end, mnesia:transaction(Fun).
Simple CRUD
Theaddress_field_type
is a table for storing reference data, which does not rely on other tables. So the C and R of the CRUD is very simple. From address_db.erl
file:
%% returns {atomic, Oid} or {aborted, Reason} generate_oid(TableName) when is_atom(TableName) -> Fun = fun() -> [Oid] = mnesia:read(oids, TableName, write), %% because Erlang only supports single assignment %% I have to create new variables every time the value changes. NewId=Oid#oids.id+1, New = Oid#oids{ id=NewId }, mnesia:write(New), NewId end, mnesia:transaction(Fun). insert_aft(Aft) when is_record(Aft, address_field_type) -> Fun = fun() -> {atomic, Id}=generate_oid(address_field_type), New = Aft#address_field_type{ id=Id }, mnesia:write(New), New end, mnesia:transaction(Fun). find_aft(Id) when is_integer(Id) -> {atomic, [Aft]} = mnesia:transaction(fun()-> mnesia:read({address_field_type, Id}) end), Aft.There are a few things worth noting:
- Although the
mnesia:read
andmnesia:select
functions can be read-only, they still need to be wrapped in a function and called bymnesia:transaction/1
. This does not make sense to me. - The
mnesia:transaction(Fun)
returns{aborted, Reason}
or{atomic, Result}
whereResult
is whatever the return result of theFun
is. Therefore, theinsert_aft/1
above returnsNew
- the newaddress_field_type
record with theid
field populated. - Erlang is single assignment language - i.e. a variable can only be assigned value once. Therefore, if I want to assign the
Aft#address_field_type.id
, I have to create a whole newaddress_field_type
record as in theinsert_aft/1
above. I have to say, I am not a fan of this single assignment - it defeats the half of the purpose of having variables.
A Slightly More Complicated CRUD
Theaddress
entity relies on address_field
, which in turn depends on address_field_type
. Also, if address
records share the same address_field
values, then the address_field
record should be shared and not be duplicated in the database. Therefore, the management of the address
records are more complicated than those of address_field_type
.
The creation of address
records is shown below.
insert_address(A, Afs) when is_record(A, address) and is_list(Afs) -> {A2, Afs2} = address:generate_location_code(A, Afs), Fun = fun() -> % populate the address_field records' id field NewAfs=lists:foldl( fun(Af, NewAfs) -> % check if the same Af is already in database Result = mnesia:select(address_field, [{ #address_field{ id='$1', location_code=Af#address_field.location_code, _='_' }, [], ['$1'] }] ), if length(Result)==0 -> % Af not in DB, so insert it {atomic, Id}=generate_oid(address_field); true -> % Af already exists in database Id=lists:nth(1, Result) end, NewAf = Af#address_field{ id=Id }, lists:append(NewAfs, [NewAf]) end, [], Afs2), % create the new address record {atomic, AddressId} = generate_oid(address), NewA = A2#address{ id = AddressId }, mnesia:write(NewA), % now insert/update into address_field % and insert into address_addressField table lists:foreach( fun(Af) -> mnesia:write(Af), A_Af=#address_addressField{ address_id=AddressId, address_field_id=Af#address_field.id }, mnesia:write(A_Af) end, NewAfs), {NewA, NewAfs} end, mnesia:transaction(Fun).From the
address.erl
file:
%% applying quick sort to AddressField list. sort_afs([]) -> []; sort_afs([Pivot|Tail]) when is_record(Pivot, address_field) -> sort_afs([Af || Af <- Tail, compare_aft(address_db:find_aft(Af#address_field.address_field_type_id), address_db:find_aft(Pivot#address_field.address_field_type_id)) < 0]) ++ [Pivot] ++ sort_afs([Af || Af <- Tail, compare_aft(address_db:find_aft(Af#address_field.address_field_type_id), address_db:find_aft(Pivot#address_field.address_field_type_id)) >= 0]). compare_aft(Aft1, Aft2) when is_record(Aft1, address_field_type) and is_record(Aft2, address_field_type) -> Aft1#address_field_type.hierarchy_order - Aft2#address_field_type.hierarchy_order. generate_location_code(A, AddressFields) when is_record(A, address) and is_list(AddressFields) and (length(AddressFields)>0) -> Afs=generate_location_code(AddressFields), FirstAf=lists:nth(1, Afs), { A#address{ location_code=FirstAf#address_field.location_code }, Afs }. generate_location_code(Afs) -> ReverseSortedAfs=lists:reverse(sort_afs(Afs)), generate_location_code(ReverseSortedAfs, [], []). generate_location_code([Head|Tail], String, NewAfs) when is_record(Head, address_field) -> StringValue = Head#address_field.value ++ String, Code=erlang:phash2(StringValue), generate_location_code(Tail, StringValue, [ Head#address_field{ location_code=Code } ] ++ NewAfs); generate_location_code([], _String, NewAfs) -> NewAfs.The following functions are queries from the
address_db.erl
file:
find_address(Id) when is_integer(Id) -> {atomic, [A]} = mnesia:transaction(fun()-> mnesia:read({address, Id}) end), A. find_address_field(Id) when is_integer(Id) -> {atomic, [Af]} = mnesia:transaction(fun()-> mnesia:read({address_field, Id}) end), Af. find_address_field_ids(A_id) -> mnesia:transaction(fun()-> mnesia:read({address_addressField, A_id}) end). find_address_ids(Af_id) -> F = fun() -> mnesia:select(address_addressField, [{ #address_addressField{ address_id='$1', address_field_id=Af_id, _='_' }, [], ['$1'] }] ) end, mnesia:transaction(F). find_address_ids2(Af_id) -> F = fun() -> Q = qlc:q([Aaf#address_addressField.address_id || Aaf <- mnesia:table(address_addressField), Aaf#address_addressField.address_field_id==Af_id]), qlc:e(Q) end, mnesia:transaction(F).Things that I want to highlight from the above are:
- Unlike Java or C#, Erlang does not have a sortable list out-of-the-box. Instead, I had to write functions that sort my list of
address_field
records - these are:sort_afs, compare_aft
- When searching using the primary key field,
mnesia:read
is used; when searching using other fields, either usemnesia:select
orQLC
. In the above code, the functionsfind_address_ids/1
andfind_address_ids2/1
do the same thing. However, the result (a list) given byfind_address_ids2/1
are reversed. - If a record with the same primary key already exists in the table, the
mnesia:write
either creates (if table type ifset
) or updates (if table type isbag
) the record in the table. - Again, due to the single assignment rule, the
lists:append(List, AnotherList)
does not change the List; instead, the new list is the return value oflists:append/2
. This is just counter-intuitive.
Related Posts:
7 comments:
Hi there.
Here are my notes on your notes ;)
1. There is function to sort lists in stdlib (i.e. out of the box). It is in module lists and called keysort (lists:keysort/2). I haven't jumped deep into your code, but are you sure lists:keysort/2 does not fit you?
2. Read about mnesia:index_read/3
4. Using lists:append/2 in this way is very inefficient:
lists:append(NewAfs, [NewAf])
instead build a list with:
NewAfs2 = [ NewAf | NewAfs ]
and then lists:reverse/1 it.
X = [ A | Y ] appends element A to the front of Y and assigns it to X and it is very efficient in Erlang.
Intuition is very relative thing. Erlang is functional language with immutable variables - intuition is quite different here in comparison to imperative languages ;) That's the reason why learning Erlang is a very good idea! It makes you look at some thing from completely different angle.
Other notes:
1. Are you sure you need address_addressField table. Why not change record 'address' and add field 'fields', which is a list of fields?
-record(address,
{id,
status,
location_code,
fields = []}).
Mnesia stores in fact ANY term, hence there is no problem with storing a list of records in there.
2. Transfering normalized database from RDBM to mnesia is not a good idea. To work efficiently with mnesia it is usually a good idea to denormalize the database.
3. Having numeric ID is absolutely not mandatory in mnesia. If theres a entry of type address_field_type with name "country", why not use field 'name' as PK and have meaningful values in referencing fields (i.e. atom 'country' instead of 0, atom 'province' instead of 1, etc.) ?
4. Sometimes using mnesia:transaction for every operation is just a overkill. Read about mnesia:dirty_* in manual
Best regards,
Gleb Peregud
"I have to say, I am not a fan of this single assignment - it defeats the half of the purpose of having variables."
But it is, afaik, the only way to ensure almost-lock-free concurrency. It has some very deep implications and as a result advantages.
Thanks gleber for the comments. One of the reasons of my blog is to attract expert opinions like this.
As of single assignment, I thought about it this morning and realised that it must be for the purpose of concurrent programming. So I logged in to add my thought, then I saw you point it out already :)
The list:keysort/2 does not work in this case as the key is not in the record, but has to be looked up from the database. I did think about adding that field into the record but decided not to so that the records match the tables structure.
I will follow your advice and give them a try.
cheers
romen
Also, about the "other notes..." on normalised database:
Indeed I used the address_addressField table to normalise the data. I did model the address record to contain a list of address_fields in my previous exercise when trying out Erlang ODBC. Here I wanted to try something different and I kept the record structure faithful to the table to make the database operations easy.
I agree that Mnesia is intended for non-normalised database as opposed to your average 3rd-party RDBMS. I work in the telco industry and do understand the data characteristics of these applications (HLRs, switches, etc. where MSISDN or similar is used for query key so there is no need for other object/record IDs).
Through these exercises I try to understand the scope of the Erlang framework(s) and what are the best use cases for them.
cheers
romen
I see, that's nice that my comments are helpful. Sorry if some of them are plain truisms for you :)
Keep learning Erlang - it is awesome!
Hi,
As far, as sorting is concern - there is another, more general purpose function:
sort(Fun, List1) -> List2
There is a built in way to do this with NewOid = mnesia:dirty_update_counter(oids,table_name,increment value)
This will update your oids table and return the next id.
however you oids table should look like this
-record( oids, {type, id} ).
http://dudefrommangalore.blogspot.com/2009/03/auto-increment-in-mnesia-database.html
Post a Comment